Preface

to the Second Edition 


When I wrote the first edition of Cybernetics some thirteen years
ago, I did it under some serious handicaps which had the effect of 
piling up unfortunate typographical errors, together with a few errors 
of content. Now I believe the time has come to reconsider cyber- 
netics, not merely as a program to be carried out at some period in 
the future, but as an existing science. I have therefore taken this
opportunity to put the necessary corrections at the disposal of my 
readers and, at the same time, to present an amplification of the 
present status of the subject and of the new related modes of thought 
which have come into being since its first publication. 

If a new scientific subject has real vitality, the center of interest
in it must and should shift in the course of years. When I first
wrote Cybernetics, the chief obstacles which I found in making my
point were that the notions of statistical information and control 
theory were novel and perhaps even shocking to the established 
attitudes of the time. At present, they have become so familiar as 
a tool of the communication engineers and of the designers of automatic
controls that the chief danger against which I must guard is
that the book may seem trite and commonplace. The role of feed- 
back both in engineering design and in biology has come to be well 
established. The role of information and the technique of measuring 
and transmitting information constitute a whole discipline for the 
engineer, for the physiologist, for the psychologist, and for the 
sociologist. The automata which the first edition of this book
barely forecast have come into their own, and the related social 
dangers against which I warned, not only in this book, but also in its 
small popular companion The Human Use of Human Beings,1 have
risen well above the horizon. 

1 Wiener, N., The Human Use of Human Beings; Cybernetics and Society, Houghton
Mifflin Company, Boston, 1950. 
Thus it behooves the cyberneticist to move on to new fields and to
transfer a large part of his attention to ideas which have arisen in the 
developments of the last decade. The simple linear feedbacks, the
study of which was so important in awakening scientists to the role of 
cybernetic study, now are seen to be far less simple and far less linear 
than they appeared at first view. Indeed, in the early days of electric 
circuit theory, the mathematical resources for systematic treatment
of circuit networks did not go beyond linear juxtapositions of resistances,
capacities, and inductances. This meant that the entire
subject could be adequately described in terms of the harmonic 
analysis of the messages transmitted, and of the impedances, ad- 
mittances, and voltage ratios of the circuits through which the 
messages were passed. 

Long before the publication of Cybernetics, it came to be realized
that the study of non-linear circuits (such as we find in many am- 
plifiers, in voltage limiters, in rectifiers, and the like) did not fit easily 
into this frame. Nevertheless, for want of a better methodology, 
many attempts were made to extend the linear notions of the older 
electrical engineering well beyond the point where the newer types of 
apparatus could be naturally expressed in terms of these. 

When I came to M.I.T. around 1920, the general mode of putting
the questions concerning non-linear apparatus was to look for a direct
extension of the notion of impedance which would cover linear as well
as non-linear systems. The result was that the study of non-linear
electrical engineering was getting into a state comparable with that
of the last stages of the Ptolemaic system of astronomy, in which 
epicycle was piled on epicycle, correction upon correction, until a vast 
patchwork structure ultimately broke down under its own weight. 

Just as the Copernican system arose out of the wreck of the overstrained
Ptolemaic system, with a simple and natural heliocentric
description of the motions of the heavenly bodies instead of the com- 
plicated and unperspicuous Ptolemaic geocentric system, so the study 
of non-linear structures and systems, whether electric or mechanical, 
whether natural or artificial, has needed a fresh and independent 
point of commencement. I have tried to initiate a new approach in 
my book Nonlinear Problems in Random Theory.1 It turns out that
the overwhelming importance of a trigonometric analysis in the 
treatment of linear phenomena does not persist when we come to 
consider non-linear phenomena. There is a clear-cut mathematical 
reason for this. Electrical circuit phenomena, like many other

1 Wiener, N., Nonlinear Problems in Random Theory, The Technology Press of
M.I.T. and John Wiley & Sons, Inc., New York, 1958. 



physical phenomena, are characterized by an invariance with respect
to a shift of origin in time. A physical experiment which will have
arrived at a certain stage by 2 o’clock if we started at noon, will have
arrived at the same stage at 2:15 if we started at 12:15. Thus the 
laws of physics concern invariants of the translation group in time. 

The trigonometric functions $\sin nt$ and $\cos nt$ show certain important
invariants with respect to the same translation group. The
general function

\[ e^{i\omega t} \]

will go into the function

\[ e^{i\omega(t+\tau)} = e^{i\omega\tau} e^{i\omega t} \]

of the same form under the translation which we obtain by adding $\tau$
to $t$. As a consequence,

\[
\begin{aligned}
a \cos n(t+\tau) + b \sin n(t+\tau)
  &= (a \cos n\tau + b \sin n\tau) \cos nt + (b \cos n\tau - a \sin n\tau) \sin nt \\
  &= a_1 \cos nt + b_1 \sin nt
\end{aligned}
\]

In other words, the families of functions

\[ A e^{i\omega t} \]

and

\[ A \cos \omega t + B \sin \omega t \]

are invariant under translation.

Now there are other families of functions which are invariant under
translations. If I consider the so-called random walk in which the
movement of a particle under any time interval has a distribution
dependent only on the length of that time interval and independent 
of everything that has happened up to its initiation, the ensemble of 
random walks will also go into itself under the time translation. 

In other words, the mere translational invariance of the trigonometric
curves is a property shared by other sets of functions as
well.

The property which is characteristic of the trigonometric functions
in addition to these invariants is that

\[ A e^{i\omega t} + B e^{i\omega t} = (A + B) e^{i\omega t} \]

so that these functions form an extremely simple linear set. It will
be noted that this property concerns linearity; that is, that we can
reduce all oscillations of a given frequency to a linear combination of
two. It is this specific property which creates the value of harmonic
analysis in the treatment of the linear properties of electric circuits.
The functions

\[ e^{i\omega t} \]

are characters of the translation group and yield a linear representation
of this group.

When, however, we deal with combinations of functions other than
addition with constant coefficients—when for example we multiply
two functions by one another—the simple trigonometric functions no
longer show this elementary group property. On the other hand, the
random functions such as appear in the random walk do have certain
properties very suitable for the discussion of their non-linear
combinations.
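
To make the contrast concrete, here is a short worked identity. It is an illustration added for the reader rather than a formula from the book, and $\omega$ simply stands for an angular frequency: a sum of two exponentials at frequency $\omega$ stays inside the family, while their product does not.

```latex
% Addition with constant coefficients stays inside the family A e^{i\omega t}:
\[ A e^{i\omega t} + B e^{i\omega t} = (A + B)\, e^{i\omega t} \]
% Multiplication leaves it: the product oscillates at 2\omega, not \omega.
\[ A e^{i\omega t} \cdot B e^{i\omega t} = AB\, e^{2 i \omega t} \]
% The real form shows the same thing, together with a constant term:
\[ \cos^2 \omega t = \tfrac{1}{2}\,(1 + \cos 2\omega t) \]
```

A non-linear element such as a squarer therefore turns a single input frequency into new frequencies, which is one way to see why a frequency-by-frequency description is no longer closed.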

It is scarcely desirable for me to go into the detail of this work here, 
for it is mathematically rather complicated, and it is covered in my
book Nonlinear Problems in Random Theory. The material in that 
book has already been put to considerable use in the discussion of 
specific non-linear problems, but much remains to be done in carrying 
out the program laid down there. What it amounts to in practice is
that an appropriate test input for the study of non-linear systems is
rather of the character of the Brownian motion than a set of trigonometric
functions. This Brownian motion function in the case of
electric circuits can be generated physically by the shot effect. This
shot effect is a phenomenon of irregularity in electrical currents
which arises from the fact that such currents are carried not as a 
continuous stream of electricity but as a sequence of indivisible and 
equal electrons. Thus electric currents are subject to statistical 
irregularities which are themselves of a certain uniform character and
which can be amplified up to the point at which they constitute an 
appreciable random noise. 

As I shall show in Chapter IX, this theory of random noise can be 
put into practical use not merely for the analysis of electrical circuits
and other non-linear processes but for their synthesis as well.1 The
device which is used is the reduction of the output of a non-linear
instrument with random input to a well-defined series of
certain orthonormal functions which are closely related to the
Hermite polynomials. The problem of the analysis of a non-linear
circuit consists in the determination of the coefficients of these
polynomials in certain parameters of the input by a process of
averaging.

The description of this process is rather simple. In addition to the 
black box which represents an as yet unanalyzed non-linear system, 
I have certain bodies of known structure which I shall call white

1 Here I am using the term “non-linear system” not to exclude linear systems but to
include a larger category of systems. The analysis of non-linear systems by means of
random noise is also applicable to linear systems and has been so used.
boxes representing the various terms in the expansion desired.1 I
put the same random noise into the black box and into a given white
box. The coefficient of the white box in the development of the
black box is given as an average of the product of their outputs.
While this average is taken over the entire ensemble of shot-effect
inputs, there is a theorem which allows us to replace this average in
all but a set of cases of probability 0 by an average taken over time.
To obtain this average, we need to have at our disposal a multiplying
instrument by which we can get the product of the outputs of the
black and the white box, as well as an averaging instrument, which
we can base on the fact that the potential across a condenser is
proportional to the quantity of electricity held in the condenser and
hence to the time integral of the current flowing through it.
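
The averaging procedure just described can be sketched numerically. What follows is a deliberately simplified illustration and not the construction of Chapter IX: it assumes a memoryless black box driven by unit-variance Gaussian samples, uses normalized Hermite polynomials as the white boxes, and replaces the multiplier and condenser by a running sum. The names `black_box`, `white_box`, and `estimate_coefficients` are hypothetical.

```python
import math
import random


def hermite(n, x):
    """Probabilists' Hermite polynomial He_n(x) via He_{k+1} = x*He_k - k*He_{k-1}."""
    if n == 0:
        return 1.0
    h_prev, h = 1.0, x
    for k in range(1, n):
        h_prev, h = h, x * h - k * h_prev
    return h


def white_box(n, x):
    """Known structure: He_n(x)/sqrt(n!), orthonormal for unit-variance Gaussian input."""
    return hermite(n, x) / math.sqrt(math.factorial(n))


def black_box(x):
    """Hypothetical unknown system (assumed memoryless for this sketch)."""
    return 2.0 + 3.0 * x + 0.5 * (x * x - 1.0)


def estimate_coefficients(system, order, samples=100_000, seed=0):
    """Average the product of black-box and white-box outputs over one noise record."""
    rng = random.Random(seed)
    sums = [0.0] * (order + 1)
    for _ in range(samples):
        x = rng.gauss(0.0, 1.0)            # same random input to both boxes
        y = system(x)
        for n in range(order + 1):
            sums[n] += y * white_box(n, x)  # the "multiplying instrument"
    return [s / samples for s in sums]      # the "averaging instrument"


coeffs = estimate_coefficients(black_box, 3)
print(coeffs)
```

For this particular stand-in the estimates should settle near 2, 3, 0.5·√2, and 0, since the chosen black box is itself a combination of the first three Hermite polynomials. The single long record plays the part of the time average that the theorem allows in place of the ensemble average.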

Not only is it possible to determine the coefficients of each white 
box constituting an additive part of the equivalent representation of 
the black box one by one, but it is also possible to determine these
quantities simultaneously. It is even possible by the use of appropriate
feedback devices to make each one of the white boxes automatically
adjust itself to the level corresponding to its coefficient in
the development of the black box. In this manner we are able to
construct a multiple white box which, when it is properly connected to
a black box and is subjected to the same random input, will automatically
form itself into an operational equivalent of the black box
even though its internal structure may be vastly different.

These operations of analysis, synthesis, and automatic self-adjustment
of white boxes into the likeness of black boxes can be
carried out by other methods which have been described by Professor
Amar Bose2 and by Professor Gabor.3 In all of them there is a use
of some process of working in, or learning, by choosing appropriate

1 The terms “black box” and “white box” are convenient and figurative expressions
of not very well determined usage. I shall understand by a black box a piece
of apparatus, such as four-terminal networks with two input and two output terminals,
which performs a definite operation on the present and past of the input
potential, but for which we do not necessarily have any information of the structure by 
which this operation is performed. On the other hand, a white box will be a similar 
network in which we have built in the relation between input and output potentials 
in accordance with a definite structural plan for securing a previously determined 
input-output relation. 

2 Bose, A. G., “Nonlinear System Characterization and Optimization,” IRE
Transactions on Information Theory, IT-5, 30-40 (1959) (Special supplement to IRE
Transactions). 

3 Gabor, D., “Electronic Inventions and Their Impact on Civilization,” Inaugural
Lecture, March 3, 1959, Imperial College of Science and Technology, University of 
London, England. 
inputs for the black and white boxes and comparing them; and in
many of these processes, including the method of Professor Gabor,
multiplication devices play an important role. While there are
many approaches to the problem of multiplying two functions
electrically, this task is not technically easy. On the one hand, a
good multiplier must work over a large range of amplitudes. On the
other hand, it must be so nearly instantaneous in its operation that it
will be accurate up to high frequencies. Gabor claims for his
multiplier a frequency range running to about 1000 cycles. In his
inaugural dissertation for the chair of Professor of Electrical Engineering
at the Imperial College of Science and Technology of the
University of London, he does not state explicitly the amplitude range
over which his method of multiplication is valid nor the degree of
accuracy to be obtained. I am awaiting very eagerly an explicit
statement of these properties so that we can give a good evaluation of
the multiplier for use in other pieces of apparatus dependent on it.

All of these devices in which an apparatus assumes a specific 
structure or function on the basis of past experience lead to a very 
interesting new attitude both in engineering and in biology. In 
engineering, devices of similar character can be used not only to play 
games and perform other purposive acts but to do so with a continual
improvement of performance on the basis of past experience. I shall 
discuss some of these possibilities in Chapter IX of this book. 
Biologically, we have at least an analogue to what is perhaps the 
central phenomenon of life. For heredity to be possible and for cells 
to multiply, it is necessary that the heredity-carrying components of
a cell—the so-called genes—be able to construct other similar 
heredity-carrying structures in their own image. It is, therefore, 
very exciting for us to be in possession of a means by which engineering
structures can produce other structures with a function similar to
their own. I shall devote Chapter X to this, and in particular shall
discuss how oscillating systems of a given frequency can reduce other
oscillating systems to the same frequency.

It is often stated that the production of any specific kind of molecule
in the image of existing ones has an analogy to the use of templates
in engineering whereby we can use a functional element of a
machine as the pattern on which another similar element is made.
The image of the template is a static one, and there must be some
process by which one gene molecule manufactures another. I give
the tentative suggestion that frequencies, let us say the frequencies of
molecular spectra, may be the pattern elements which carry the
identity of biological substances; and the self-organization of genes
may be a manifestation of the self-organization of frequencies which I
shall discuss later.

I have already spoken of learning machines in a general way. I
shall devote a chapter to a more detailed discussion of these machines
and potentialities and some of the problems of their use. Here I
wish to make a few comments of a general nature.

As will be seen in Chapter I, the notion of learning machines is as 
old as cybernetics itself. In the anti-aircraft predictors which I
described, the linear characteristics of the predictor which is used at 
any given time depend on a long-time acquaintance with the statistics 
of the ensemble of time series which we desire to predict. While a 
knowledge of these characteristics can be worked out mathematically 
in accordance with the principles which I have given there, it is
perfectly possible to devise a computer which will work up these
statistics and develop the short-time characteristics of the predictor
on the basis of an experience which is already observed by the same
machine as is used for prediction and which is worked up automatically.
This can go far beyond the purely linear predictor. In
various papers by Kallianpur, Masani, Akutowicz, and myself,1 we
have developed a theory of non-linear prediction which can at least
conceivably be mechanized in a similar manner with the use of long-time
observations to give the statistical basis for short-time prediction.

The theory of linear prediction and of non-linear prediction both 
involve some criteria of the goodness of fit of the prediction. The
simplest criterion, although by no means the only usable one, is that 
of minimizing the mean square of the error. This is used in a 
particular form in connection with the functionals of the Brownian 
motion which I employ for the construction of non-linear apparatus, 
inasmuch as the various terms of my development have certain 
orthogonality properties. These ensure that the partial sum of a
finite number of these terms is the best simulation of the apparatus to
be imitated, which can be made by the employment of these terms if 
the mean square criterion of error is to be maintained. The work of 
Gabor also depends upon mean square criterion of error, but in a 
more general way, applicable to time series obtained by experience.
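
The connection between orthogonality and the mean square criterion can be written out in a few lines. This is a standard argument supplied here as an illustration; the notation is assumed rather than taken from the book: $f$ is the output to be imitated, $\varphi_1, \dots, \varphi_N$ are the terms of the development, and $\langle \cdot \rangle$ denotes the average.

```latex
% Orthonormal terms and the coefficients obtained by averaging:
\[ \langle \varphi_j \varphi_k \rangle = \delta_{jk}, \qquad c_k = \langle f \varphi_k \rangle \]
% Mean square error of an arbitrary combination with coefficients a_k:
\[
\Big\langle \Big( f - \sum_{k=1}^{N} a_k \varphi_k \Big)^{2} \Big\rangle
  = \langle f^2 \rangle - 2 \sum_{k=1}^{N} a_k c_k + \sum_{k=1}^{N} a_k^2
  = \langle f^2 \rangle - \sum_{k=1}^{N} c_k^2 + \sum_{k=1}^{N} (a_k - c_k)^2
\]
% The last sum is non-negative and vanishes only when a_k = c_k for every k.
```

The error is therefore smallest when each coefficient equals its average $c_k$, and adding further terms does not disturb the coefficients already found.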

The notion of learning machines can be extended far beyond its 

1 Wiener, N., and P. Masani, “The Prediction Theory of Multivariate Stochastic
Processes,” Part I, Acta Mathematica, 98, 111-150 (1957); Part II, ibid., 99, 93-137
(1958). Also Wiener, N., and E. J. Akutowicz, “The Definition and Ergodic Properties
of the Stochastic Adjoint of a Unitary Transformation,” Rendiconti del Circolo
Matematico di Palermo, Ser. II, VI, 205-217 (1957).
employment for predictors, filters, and other similar apparatus. It
is particularly important for the study and construction of machines
which play a competitive game like checkers. Here the vital work
has been done by Samuel1 and Watanabe2 at the laboratories of the
International Business Machines Corporation. As in the case of
filters and predictors, certain functions of the time series are developed
in terms of which a much larger class of functions can be expanded. 
These functions can have numerical evaluations of the significant 
quantities on which the successful playing of a game depends. For 
example, they comprise the number of pieces on both sides, the total 
command of these pieces, their mobility, and so forth. At the beginning
of the employment of the machine, these various considerations
are given tentative weightings, and the machine chooses that
admissible move for which the total weighting will have a maximum
value. Up to this point, the machine has worked with a rigid
program and has not been a learning machine. 

However, at times the machine assumes a different task. It tries 
to expand that function which is 1 for won games, 0 for lost games, 
and perhaps 1/2 for drawn games in terms of the various functions
expressing the considerations of which the machine is able to take 
cognizance. In this way, it redetermines the weightings of these 
considerations so as to be able to play a more sophisticated game. 
I shall discuss some of the properties of these machines in Chapter IX, 
but here I must point out that they have been sufficiently successful 
for the machine to be able to defeat its programmer in from 10 to 20 
hours of learning and working in. I also wish to mention in that 
chapter some of the work that has been done on similar machines 
devised for proving geometrical theorems and for simulating, to a 
limited extent, the logic of induction. 
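
The two phases described above, playing with fixed weightings and then redetermining them from the results of games, can be sketched as follows. This is only a minimal illustration of the idea as stated in the text, not Samuel's program: the features (a constant term, piece advantage, mobility advantage), the sample data, and the simple least-squares update rule are all assumptions made for the example.

```python
from typing import Dict, List, Sequence, Tuple

Features = Sequence[float]


def score(weights: Sequence[float], features: Features) -> float:
    """Total weighting of a position: weighted sum of its considerations."""
    return sum(w * f for w, f in zip(weights, features))


def choose_move(weights: Sequence[float], candidates: Dict[str, Features]) -> str:
    """Rigid phase: pick the admissible move whose total weighting is largest."""
    return max(candidates, key=lambda move: score(weights, candidates[move]))


def refit_weights(
    weights: Sequence[float],
    games: Sequence[Tuple[Features, float]],
    rate: float = 0.05,
    epochs: int = 500,
) -> List[float]:
    """Learning phase: nudge the weights so the score approximates the outcome
    (1 for a win, 0 for a loss, 1/2 for a draw) in the mean square sense."""
    w = list(weights)
    for _ in range(epochs):
        for features, outcome in games:
            error = outcome - score(w, features)
            for i, f in enumerate(features):
                w[i] += rate * error * f
    return w


# Hypothetical records: (constant, piece advantage, mobility advantage) -> outcome
games = [
    ((1.0, 1.0, 1.0), 1.0),    # won
    ((1.0, -1.0, -1.0), 0.0),  # lost
    ((1.0, 1.0, -1.0), 0.5),   # drawn
    ((1.0, -1.0, 1.0), 0.5),   # drawn
]
initial = [0.0, 0.5, 0.1]      # tentative weightings
learned = refit_weights(initial, games)
print(learned)
```

With these made-up records the weights approach 0.5, 0.25, and 0.25, so the machine ends up valuing pieces and mobility equally, whatever its tentative starting weightings were.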

All of this work is a part of the theory and practice of the programming
of programming, which has been extensively studied in the
Electronic Systems Laboratory of the Massachusetts Institute of
Technology. Here it has been found out that unless some such
learning device is employed, the programming of a rigidly patterned 
machine is itself a very difficult task and that there is an urgent need 
for devices to program this programming. 

Now that the concept of learning machines is applicable to those 
machines which we have made ourselves, it is also relevant to those 

1 Samuel, A. L., “Some Studies in Machine Learning, Using the Game of Checkers,”
IBM Journal of Research and Development, 3, 210-229 (1959). 

2 Watanabe, S., “Information Theoretical Analysis of Multivariate Correlation,”
IBM Journal of Research and Development, 4, 66-82 (1960).
living machines which we call animals, so that we have the possibility
of throwing a new light on biological cybernetics. Here I wish to 
single out, among a variety of current investigations, a book by the 
Stanley-Jones on the Kybernetics (notice the spelling) of living 
systems.1 In this book they devote a great deal of attention to those
feedbacks which maintain the working level of the nervous system as 
well as those other feedbacks which respond to special stimuli. 
Since the combination of the level of the system with the particular 
responses is to a considerable extent multiplicative, it is also non-linear
and involves considerations of the sort we have already
brought out. This field of activity is very much alive at present, 
and I expect it to become much more alive in the near future. 

The methods of memory machines and of machines that multiply 
themselves which I have so far given are largely, although not 
entirely, those which depend on apparatus of a high degree of specificity,
or of what I may call blueprint apparatus. The physiological
aspects of the same process must conform more to the peculiar
techniques of living organisms in which blueprints are replaced by a
less specific process, but one in which the system organizes itself. 
Chapter X of this book is devoted to a sample of a self-organizing 
process, namely, that by which narrow, highly specific frequencies 
are formed in brain waves. It is, therefore, largely the physiological 
counterpart of the previous chapter, in which I am discussing similar 
processes on more of a blueprint basis. This existence of sharp
frequencies in brain waves and the theories which I gave to explain 
how they are originated, what they can do, and what medical use 
may be made of them represent in my mind an important and new 
break-through in physiology. Similar ideas can be used in many 
other places in physiology and can make a real contribution to the 
study of the fundamentals of life phenomena. In this field, what I
am giving is more a program than work already achieved, but it is a
program for which I have great hopes. 

It has not been my intention, either in the first edition or in the 
present one, to make this book a compendium of all that has been 
done in cybernetics. Neither my interests nor my abilities lie that
way. My intention is to express and to amplify my ideas on this 
subject, and to display some of the ideas and philosophical reflections
which led me in the beginning to enter upon this field, and which 
have continued to interest me in its development. Thus it is an 
intensely personal book, devoting much space to those developments 

1 Stanley-Jones, D., and K. Stanley-Jones, Kybernetics of Natural Systems, A Study 
in Patterns of Control, Pergamon Press, London, 1960.
in which I myself have been interested, and relatively little to those
in which I have not worked myself. 

I have had valuable help from many quarters in revising this book. 
I must acknowledge in particular the cooperation of Miss Constance 
D. Boyd of The M.I.T. Press, Dr. Shikao Ikehara of the Tokyo 
Institute of Technology, Dr. Y. W. Lee of the Electrical Engineering 
Department of M.I.T., and Dr. Gordon Raisbeck of the Bell 
Telephone Laboratories. Also, in the writing down of my new 
chapters, and particularly in the computations of Chapter X, in 
which I have considered the case of self-organizing systems which 
manifest themselves in the study of the electroencephalogram, I wish 
to mention the aid which I received from my students, John C. 
Kotelly and Charles E. Robinson, and especially the contribution of 
Dr. John S. Barlow of the Massachusetts General Hospital. The 
indexing was done by James W. Davis. 

Without the meticulous care and devotion of all of these I would 
not have had either the courage or the accuracy to turn out a new and 
corrected edition. 

Norbert Wiener 

Cambridge, Massachusetts,

March, 1961 
Introduction


This book represents the outcome, after more than a decade, of a
program of work undertaken jointly with Dr. Arturo Rosenblueth,
then of the Harvard Medical School and now of the Instituto
Nacional de Cardiología of Mexico. In those days, Dr. Rosenblueth,
who was the colleague and collaborator of the late Dr. Walter B. 
Cannon, conducted a monthly series of discussion meetings on scien- 
tific method. The participants were mostly young scientists at the 
Harvard Medical School, and we would gather for dinner about a 
round table in Vanderbilt Hall. The conversation was lively and 
unrestrained. It was not a place where it was either encouraged or 
made possible for anyone to stand on his dignity. After the meal, 
somebody—either one of our group or an invited guest—would read 
a paper on some scientific topic, generally one in which questions of 
methodology were the first consideration, or at least a leading 
consideration. The speaker had to run the gauntlet of an acute 
criticism, good-natured but unsparing. It was a perfect catharsis 
for half-baked ideas, insufficient self-criticism, exaggerated self- 
confidence, and pomposity. Those who could not stand the gaff did 
not return, but among the former habitués of these meetings there is
more than one of us who feels that they were an important and 
permanent contribution to our scientific unfolding. 

Not all the participants were physicians or medical scientists. 
One of us, a very steady member, and a great help to our discussions, 
was Dr. Manuel Sandoval Vallarta, a Mexican like Dr. Rosenblueth 
and a Professor of Physics at the Massachusetts Institute of Technology,
who had been among my very first students when I came to
the Institute after World War I. Dr. Vallarta used to bring some 
of his M.I.T. colleagues along to these discussion meetings, and it 
was at one of these that I first met Dr. Rosenblueth. I had been 
interested in the scientific method for a long time and had, in fact, 
been a participant in Josiah Royce’s Harvard seminar on the subject 
in 1911-1913. Moreover, it was felt that it was essential to have
someone present who could examine mathematical questions
critically. I thus became an active member of the group until Dr.
Rosenblueth’s call to Mexico in 1944 and the general confusion of the
war ended the series of meetings. 

For many years Dr. Rosenblueth and I had shared the conviction
that the most fruitful areas for the growth of the sciences were
those which had been neglected as a no-man’s land between the 
various established fields. Since Leibniz there has perhaps been no 
man who has had a full command of all the intellectual activity of his
day. Since that time, science has been increasingly the task of
specialists, in fields which show a tendency to grow progressively 
narrower. A century ago there may have been no Leibniz, but there 
was a Gauss, a Faraday, and a Darwin. Today there are few scholars
who can call themselves mathematicians or physicists or biologists
without restriction. A man may be a topologist or an acoustician or 
a coleopterist. He will be filled with the jargon of his field, and will 
know all its literature and all its ramifications, but, more frequently 
than not, he will regard the next subject as something belonging to 
his colleague three doors down the corridor, and will consider any 
interest in it on his own part as an unwarrantable breach of privacy. 

These specialized fields are continually growing and invading new 
territory. The result is like what occurred when the Oregon country 
was being invaded simultaneously by the United States settlers, the
British, the Mexicans, and the Russians—an inextricable tangle of
exploration, nomenclature, and laws. There are fields of scientific
work, as we shall see in the body of this book, which have been
explored from the different sides of pure mathematics, statistics,
electrical engineering, and neurophysiology; in which every single
notion receives a separate name from each group, and in which
important work has been triplicated or quadruplicated, while still
other important work is delayed by the unavailability in one field of
results that may have already become classical in the next field.

It is these boundary regions of science which offer the richest
opportunities to the qualified investigator. They are at the same 
time the most refractory to the accepted techniques of mass attack 
and the division of labor. If the difficulty of a physiological problem
is mathematical in essence, ten physiologists ignorant of mathematics 
will get precisely as far as one physiologist ignorant of mathematics, 
and no further. If a physiologist who knows no mathematics works 
together with a mathematician who knows no physiology, the one
will be unable to state his problem in terms that the other can manipulate,
and the second will be unable to put the answers in any form
that the first can understand. Dr. Rosenblueth has always insisted
that a proper exploration of these blank spaces on the map of science
could only be made by a team of scientists, each a specialist in his
own field but each possessing a thoroughly sound and trained
acquaintance with the fields of his neighbors; all in the habit of working
together, of knowing one another’s intellectual customs, and of
recognizing the significance of a colleague’s new suggestion before it
has taken on a full formal expression. The mathematician need not
have the skill to conduct a physiological experiment, but he must have
the skill to understand one, to criticize one, and to suggest one. The
physiologist need not be able to prove a certain mathematical
theorem, but he must be able to grasp its physiological significance
and to tell the mathematician for what he should look. We had 
dreamed for years of an institution of independent scientists, working 
together in one of these backwoods of science, not as subordinates of
some great executive officer, but joined by the desire, indeed by the 
spiritual necessity, to understand the region as a whole, and to lend
one another the strength of that understanding. 

We had agreed on these matters long before we had chosen the 
field of our joint investigations and our respective parts in them.
The deciding factor in this new step was the war. I had known for 
a considerable time that if a national emergency should come, my 
function in it would be determined largely by two thir.gs: my cióse 
contact with the program of computing machines developed by Dr. 
Vannevar Bush, and my own joint work with Dr. Yuk Wing Lee on 
the design of electric networks. In fact, both proved important. 
In the summer of 1940,1 tumed a large part of my attention to the 
development of computing machines for the solution of partial 
differential equations. I had long been interested in these and had 
convinced myself that their chief problem, as contrasted with the 
ordinary differential equations so well treated by Dr. Bush on his 
differential analyzer, was that of the representation of functions of 
more than one variable. I had also become convinced that the process 
of scanning, as employed in television, gave the answer to that 
question and, in fact, that television was destined to be more useful 
to engineering by the introduction of such new techniques than as an 
independent industry. 

It was clear that any scanning process must vastly increase the 
number of data dealt with as compared with the number of data in a 
problem of ordinary differential equations. To accomplish reasonable 
results in a reasonable time, it thus became necessary to push 
the speed of the elementary processes to the maximum, and to avoid 
interrupting the stream of these processes by steps of an essentially 
slower nature. It also became necessary to perform the individual 
processes with so high a degree of accuracy that the enormous 
repetition of the elementary processes should not bring about a 
cumulative error so great as to swamp all accuracy. Thus the follow- 
ing requirements were suggested: 

1. That the central adding and multiplying apparatus of the 
computing machine should be numerical, as in an ordinary adding 
machine, rather than on a basis of measurement, as in the Bush 
differential analyzer. 

2. That these mechanisms, which are essentially switching devices, 
should depend on electronic tubes rather than on gears or mechanical 
relays, in order to secure quicker action. 

3. That, in accordance with the policy adopted in some existing 
apparatus of the Bell Telephone Laboratories, it would probably be 
more economical in apparatus to adopt the scale of two for addition 
and multiplication, rather than the scale of ten. 

4. That the entire sequence of operations be laid out on the 
machine itself so that there should be no human intervention from 
the time the data were entered until the final results should be taken 
off, and that all logical decisions necessary for this should be built 
into the machine itself. 

5. That the machine contain an apparatus for the storage of data 
which should record them quickly, hold them firmly until erasure, 
read them quickly, erase them quickly, and then be immediately 
available for the storage of new material. 

These recommendations, together with tentative suggestions for 
the means of realizing them, were sent in to Dr. Vannevar Bush for 
their possible use in a war. At that stage of the preparations 
for war, they did not seem to have sufficiently high priority to make 
immediate work on them worth while. Nevertheless, they all 
represent ideas which have been incorporated into the modern 
ultra-rapid computing machine. These notions were all very much 
in the spirit of the thought of the time, and I do not for a moment 
wish to claim anything like the sole responsibility for their introduction. 
Nevertheless, they have proved useful, and it is my hope 
that my memorandum had some effect in popularizing them among 
engineers. At any rate, as we shall see in the body of the book, they 
are all ideas which are of interest in connection with the study of the 
nervous system. 

This work was thus laid on the table, and, although it has not 
proved to be fruitless, it led to no immediate project by Dr. Rosenblueth 
and myself. Our actual collaboration resulted from another 
project, which was likewise undertaken for the purposes of the last 
war. At the beginning of the war, the German prestige in aviation 
and the defensive position of England turned the attention of many 
scientists to the improvement of anti-aircraft artillery. Even before 
the war, it had become clear that the speed of the airplane had 
rendered obsolete all classical methods of the direction of fire, and 
that it was necessary to build into the control apparatus all the 
computations necessary. These were rendered much more difficult 
by the fact that, unlike all previously encountered targets, an airplane 
has a velocity which is a very appreciable part of the velocity of the 
missile used to bring it down. Accordingly, it is exceedingly important 
to shoot the missile, not at the target, but in such a way that 
missile and target may come together in space at some time in the 
future. We must hence find some method of predicting the future 
position of the plane. 

The simplest method is to extrapolate the present course of the 
plane along a straight line. This has much to recommend it. The 
more a plane doubles and curves in flight, the less is its effective 
velocity, the less time it has to accomplish a mission, and the longer 
it remains in a dangerous region. Other things being equal, a plane 
will fly as straight a course as possible. However, by the time the 
first shell has burst, other things are not equal, and the pilot will 
probably zigzag, stunt, or in some other way take evasive action. 
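
The straight-line rule can be made concrete with a minimal sketch in Python. It is an illustration only, not the fire-control apparatus itself: the function name, the two-observation input, and the sample numbers are all assumptions introduced here.

```python
def extrapolate_straight_line(previous, current, interval, lead):
    """Predict a position `lead` seconds ahead, assuming constant velocity.

    `previous` and `current` are (x, y, z) positions observed `interval`
    seconds apart. All names are hypothetical, chosen for illustration.
    """
    # Velocity estimated from the two most recent observations.
    velocity = [(c - p) / interval for p, c in zip(previous, current)]
    # Continue the present course unchanged for `lead` seconds.
    return tuple(c + v * lead for c, v in zip(current, velocity))


# A plane observed twice, one second apart; assumed shell flight time 3 s.
aim_point = extrapolate_straight_line(
    (0.0, 0.0, 3000.0), (100.0, 50.0, 3000.0), interval=1.0, lead=3.0
)
print(aim_point)
```

The sketch aims where the plane will be only if it holds its course, which is exactly the assumption that fails once the pilot begins evasive action.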

If this action were completely at the disposal of the pilot, and the 
pilot were to make the sort of intelligent use of his chances that we 
anticipate in a good poker player, for example, he has so much 
opportunity to modify his expected position before the arrival of a 
shell that we should not reckon the chances of hitting him to be very 
good, except perhaps in the case of a very wasteful barrage fire. On 
the other hand, the pilot does not have a completely free chance to 
maneuver at his will. For one thing, he is in a plane going at an 
exceedingly high speed, and any too sudden deviation from his course 
will produce an acceleration that will render him unconscious and 
may disintegrate the plane. Then too, he can control the plane only 
by moving his control surfaces, and the new regimen of flow that 
is established takes some small time to develop. Even when it is 
fully developed, it merely changes the acceleration of the plane, and 
this change of acceleration must be converted, first into change of 
velocity and then into change of position, before it is finally effective. 
Moreover, an aviator under the strain of combat conditions is 
scarcely in a mood to engage in any very complicated and untrammeled 
voluntary behavior, and is quite likely to follow out the pattern 
of activity in which he has been trained. 

All this made an investigation of the problem of the curvilinear 
prediction of flight worth while, whether the results should prove 
favorable or unfavorable for the actual use of a control apparatus 
involving such curvilinear prediction. To predict the future of a 
curve is to carry out a certain operation on its past. The true 
prediction operator cannot be realized by any constructible apparatus; 
but there are certain operators which bear it a certain resemblance 
and are, in fact, realizable by apparatus which we can build. I 
suggested to Professor Samuel Caldwell of the Massachusetts Institute 
of Technology that these operators seemed worth trying, and he 
immediately suggested that we try them out on Dr. Bush’s differential 
analyzer, using this as a ready-made model of the desired fire- 
control apparatus. We did so, with results which will be discussed 
in the body of this book. At any rate, I found myself engaged in a 
war project, in which Mr. Julian H. Bigelow and myself were partners 
in the investigation of the theory of prediction and of the construction 
of apparatus to embody these theories. 

It will be seen that for the second time I had become engaged in 
the study of a mechanico-electrical system which was designed to 
usurp a specifically human function—in the first case, the execution 
of a complicated pattern of computation, and in the second, the 
forecasting of the future. In this second case, we should not avoid 
the discussion of the performance of certain human functions. In 
some fire-control apparatus, it is true, the original impulse to point 
comes in directly by radar, but in the more usual case, there is a 
human gun-pointer or a gun-trainer or both coupled into the fire- 
control system, and acting as an essential part of it. It is essential to 
know their characteristics, in order to incorporate them mathematically 
into the machines they control. Moreover, their target, 
the plane, is also humanly controlled, and it is desirable to know its 
performance characteristics. 

Mr. Bigelow and I came to the conclusion that an extremely 
important factor in voluntary activity is what the control engineers 
term feedback. I shall discuss this in considerable detail in the 
appropriate chapters. It is enough to say here that when we 
desire a motion to follow a given pattern the difference between this 
pattern and the actually performed motion is used as a new input to 
cause the part regulated to move in such a way as to bring its 
motion closer to that given by the pattern. For example, one form 
of steering engine of a ship carries the reading of the wheel to an 
offset from the tiller, which so regulates the valves of the steering 
engine as to move the tiller in such a way as to turn these valves off. 
Thus the tiller turns so as to bring the other end of the 
valve-regulating offset amidships, and in that way registers the angular 
position of the wheel as the angular position of the tiller. Clearly, 
any friction or other delaying force which hampers the motion of the 
tiller will increase the admission of steam to the valves on one side 
and will decrease it on the other, in such a way as to increase the 
torque tending to bring the tiller to the desired position. Thus the 
feedback system tends to make the performance of the steering engine 
relatively independent of the load. 

On the other hand, under certain conditions of delay, etc., a 
feedback that is too brusque will make the rudder overshoot, and 
will be followed by a feedback in the other direction, which makes the 
rudder overshoot still more, until the steering mechanism goes into 
a wild oscillation or hunting, and breaks down completely. In a 
book such as that by MacColl, 1 we find a very precise discussion of 
feedback, the conditions under which it is advantageous, and the 
conditions under which it breaks down. It is a phenomenon which 
we understand very thoroughly from a quantitative point of view. 
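
The contrast between a moderate and a too brusque feedback can be seen in a small numerical sketch. This is a toy discrete-time loop, assumed here purely for illustration and not a model of any actual steering engine: the rudder is corrected in proportion to an error that is measured one step late, and `gain`, `delay`, and `steps` are hypothetical parameters.

```python
def simulate_feedback(gain, delay=1, steps=40, target=1.0):
    """Return successive rudder positions under delayed proportional feedback.

    Each step adds gain * (target - stale position) to the current position,
    where the stale position is the one measured `delay` steps earlier.
    """
    positions = [0.0] * (delay + 1)  # rudder starts at rest at zero
    for _ in range(steps):
        observed = positions[-1 - delay]  # measurement arrives late
        error = target - observed
        positions.append(positions[-1] + gain * error)
    return positions


gentle = simulate_feedback(gain=0.3)   # settles on the target
brusque = simulate_feedback(gain=1.5)  # overshoots more and more
print(round(gentle[-1], 6))
print(max(abs(p - 1.0) for p in brusque))
```

With the small gain the position converges on the target; with the large gain each correction arrives too late and too strong, so the swings grow from one overshoot to the next, which is the hunting described above.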

Now, suppose that I pick up a lead pencil. To do this, I have to 
move certain muscles. However, for all of us but a few expert 
anatomists, we do not know what these muscles are; and even among 
the anatomists, there are few, if any, who can perform the act by a 
conscious willing in succession of the contraction of each muscle 
concerned. On the contrary, what we will is to pick the pencil up. 
Once we have determined on this, our motion proceeds in such a way 
that we may say roughly that the amount by which the pencil is not 
yet picked up is decreased at each stage. This part of the action is 
not in full consciousness. 

To perform an action in such a manner, there must be a report to 
the nervous system, conscious or unconscious, of the amount by 
which we have failed to pick up the pencil at each instant. If we 
have our eye on the pencil, this report may be visual, at least in part, 
but it is more generally kinesthetic, or, to use a term now in vogue, 
proprioceptive. If the proprioceptive sensations are wanting and 
we do not replace them by a visual or other substitute, we are 
unable to perform the act of picking up the pencil, and find ourselves 

1 MacColl, L. A., Fundamental Theory of Servomechanisms, Van Nostrand, New 
York, 1946. 
in a state of what is known as ataxia. An ataxia of this type is 
familiar in the form of syphilis of the central nervous system known 
as tabes dorsalis, where the kinesthetic sense conveyed by the spinal 
nerves is more or less destroyed. 

However, an excessive feedback is likely to be as serious a handicap 
to organized activity as a defective feedback. In view of this 
possibility, Mr. Bigelow and myself approached Dr. Rosenblueth 
with a very specific question. Is there any pathological condition 
in which the patient, in trying to perform some voluntary act like 
picking up a pencil, overshoots the mark, and goes into an uncontrol- 
lable oscillation? Dr. Rosenblueth immediately answered us that 
there is such a well-known condition, that it is called purpose tremor, 
and that it is often associated with injury to the cerebellum. 

We thus found a most significant confirmation of our hypothesis 
concerning the nature of at least some voluntary activity. It will 
be noted that our point of view considerably transcended that current 
among neurophysiologists. The central nervous system no longer 
appears as a self-contained organ, receiving inputs from the senses 
and discharging into the muscles. On the contrary, some of its 
most characteristic activities are explicable only as circular processes, 
emerging from the nervous system into the muscles, and re-entering 
the nervous system through the sense organs, whether they be 
proprioceptors or organs of the special senses. This seemed to us to 
mark a new step in the study of that part of neurophysiology which 
concerns not solely the elementary processes of nerves and synapses 
but the performance of the nervous system as an integrated whole. 

The three of us felt that this new point of view merited a paper, 
which we wrote up and published.1 Dr. Rosenblueth and I foresaw 
that this paper could be only a statement of program for a large 
body of experimental work, and we decided that if we could ever 
bring our plan for an interscientific institute to fruition, this topic 
would furnish an almost ideal center for our activity. 

On the communication engineering plane, it had already become 
clear to Mr. Bigelow and myself that the problems of control engineering 
and of communication engineering were inseparable, and that 
they centered not around the technique of electrical engineering but 
around the much more fundamental notion of the message, whether 
this should be transmitted by electrical, mechanical, or nervous 
means. The message is a discrete or continuous sequence of measurable 
events distributed in time, precisely what is called a time series 

1 Rosenblueth, A., N. Wiener, and J. Bigelow, “Behavior, Purpose, and Teleology,” 
Philosophy of Science, 10, 18-24 (1943). 
by the statisticians. The prediction of the future of a message is 
done by some sort of operator on its past, whether this operator is 
realized by a scheme of mathematical computation, or by a mechanical 
or electrical apparatus. In this connection, we found that the 
ideal prediction mechanisms which we had at first contemplated were 
beset by two types of error, of a roughly antagonistic nature. 
While the prediction apparatus which we at first designed could be 
made to anticipate an extremely smooth curve to any desired degree 
of approximation, this refinement of behavior was always attained 
at the cost of an increasing sensitivity. The better the apparatus 
was for smooth waves, the more it would be set into oscillation by 
small departures from smoothness, and the longer it would be before 
such oscillations would die out. Thus the good prediction of a 
smooth wave seemed to require a more delicate and sensitive apparatus 
than the best possible prediction of a rough curve, and the 
choice of the particular apparatus to be used in a specific case was 
dependent on the statistical nature of the phenomenon to be pre- 
dicted. This interacting pair of types of error seemed to have 
something in common with the contrasting problems of the measure 
of position and of momentum to be found in the Heisenberg quantum 
mechanics, as described according to his Principle of Uncertainty. 

Once we had clearly grasped that the solution of the problem of 
optimum prediction was only to be obtained by an appeal to the 
statistics of the time series to be predicted, it was not difficult to 
make what had originally seemed to be a difficulty in the theory of 
prediction into what was actually an efficient tool for solving the 
problem of prediction. Assuming the statistics of a time series, it 
became possible to derive an explicit expression for the mean square 
error of prediction by a given technique and for a given lead. Once 
we had this, we could translate the problem of optimum prediction to 
the determination of a specific operator which should reduce to a 
minimum a specific positive quantity dependent on this operator. 
Minimization problems of this type belong to a recognized branch of 
mathematics, the calculus of variations, and this branch has a 
recognized technique. With the aid of this technique, we were able 
to obtain an explicit best solution of the problem of predicting the 
future of a time series, given its statistical nature, and even further, 
to achieve a physical realization of this solution by a constructible 
apparatus. 
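
A greatly simplified, finite-sample analogue of this minimization may help fix the idea. The sketch below is an assumption made for illustration and is far cruder than the theory described above: the operator is restricted to a single multiplier `a` applied to the present value, and the series statistics are estimated from the samples themselves. All function names are hypothetical.

```python
def best_one_tap_predictor(series, lead=1):
    """Multiplier a minimizing the mean square of x[t + lead] - a * x[t].

    Setting the derivative of the squared error to zero gives
    a = sum(x[t] * x[t + lead]) / sum(x[t] ** 2),
    a ratio of sample statistics of the series.
    """
    pairs = list(zip(series[:-lead], series[lead:]))
    numerator = sum(x * y for x, y in pairs)
    denominator = sum(x * x for x, _ in pairs)
    return numerator / denominator


def mean_square_error(series, a, lead=1):
    """Mean square prediction error of the rule x[t + lead] ~ a * x[t]."""
    pairs = list(zip(series[:-lead], series[lead:]))
    return sum((y - a * x) ** 2 for x, y in pairs) / len(pairs)


samples = [1.0, -0.8, 0.7, -0.5, 0.45, -0.3]
a = best_one_tap_predictor(samples)
print(a, mean_square_error(samples, a))
```

The positive quantity to be minimized is the mean square error, and the minimizing multiplier is fixed entirely by the statistics of the series, not by any property of a single smooth curve.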

Once we had done this, at least one problem of engineering 
design took on a completely new aspect. In general, engineering 
design has been held to be an art rather than a science. By reducing 
a problem of this sort to a minimization principle, we had established 
the subject on a far more scientific basis. It occurred to us that this 
was not an isolated case, but that there was a whole region of 
engineering work in which similar design problems could be solved by 
the methods of the calculus of variations. 

We attacked and solved other similar problems by the same 
methods. Among these was the problem of the design of wave 
filters. We often find a message contaminated by extraneous 
disturbances which we call background noise. We then face the 
problem of restoring the original message, or the message under a 
given lead, or the message modified by a given lag, by an operator 
applied to the corrupted message. The optimum design of this 
operator and of the apparatus by which it is realized depends on the 
statistical nature of the message and the noise, singly and jointly. 
We thus have replaced in the design of wave filters processes which 
were formerly of an empirical and rather haphazard nature by 
processes with a thorough scientific justification. 

In doing this, we have made of communication engineering design 
a statistical science, a branch of statistical mechanics. The notion 
of statistical mechanics has indeed been encroaching on every 
branch of science for more than a century. We shall see that this 
dominance of statistical mechanics in modern physics has a very vital 
significance for the interpretation of the nature of time. In the 
case of communication engineering, however, the significance of the 
statistical element is immediately apparent. The transmission of 
information is impossible save as a transmission of alternatives. If 
only one contingency is to be transmitted, then it may be sent most 
efficiently and with the least trouble by sending no message at all. 
The telegraph and the telephone can perform their function only if 
the messages they transmit are continually varied in a manner not 
completely determined by their past, and can be designed effectively 
only if the variation of these messages conforms to some sort of 
statistical regularity. 

To cover this aspect of communication engineering, we had to 
develop a statistical theory of the amount of information, in which the 
unit amount of information was that transmitted as a single decision 
between equally probable alternatives. This idea occurred at about 
the same time to several writers, among them the statistician R. A. 
Fisher, Dr. Shannon of the Bell Telephone Laboratories, and the 
author. Fisher’s motive in studying this subject is to be found in 
classical statistical theory; that of Shannon in the problem of coding 
information; and that of the author in the problem of noise and 
message in electrical filters. Let it be remarked parenthetically that 
some of my speculations in this direction attach themselves to the 
earlier work of Kolmogoroff 1 in Russia, although a considerable part 
of my work was done before my attention was called to the work of 
the Russian school. 
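
The unit described above, a single decision between equally probable alternatives, is what is now commonly called one bit. As a hedged illustration, the sketch below uses the now-standard logarithmic formula associated with Shannon; that formula is not stated in this passage and is assumed here, as is the function name.

```python
import math


def information_bits(probabilities):
    """Average information, in binary units, of a choice among alternatives.

    Uses the standard formula -sum(p * log2(p)); this formula is an
    assumption brought in for illustration, not quoted from the text.
    """
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


print(information_bits([0.5, 0.5]))   # one decision between two equal alternatives
print(information_bits([0.25] * 4))   # two such decisions in succession
print(information_bits([1.0]))        # a single contingency carries no information
```

The last case matches the remark that a message with only one possible content is best sent by sending nothing at all.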

The notion of the amount of information attaches itself very 
naturally to a classical notion in statistical mechanics: that of 
entropy. Just as the amount of information in a system is a measure 
of its degree of organization, so the entropy of a system is a measure 
of its degree of disorganization; and the one is simply the negative of 
the other. This point of view leads us to a number of considerations 
concerning the second law of thermodynamics, and to a study of the 
possibility of the so-called Maxwell demons. Such questions arise 
independently in the study of enzymes and other catalysts, and their 
study is essential for the proper understanding of such fundamental 
phenomena of living matter as metabolism and reproduction. The 
third fundamental phenomenon of life, that of irritability, belongs to 
the domain of communication theory and falls under the group of 
ideas we have just been discussing. 2 

Thus, as far back as four years ago, the group of scientists about 
Dr. Rosenblueth and myself had already become aware of the 
essential unity of the set of problems centering about communication, 
control, and statistical mechanics, whether in the machine or in 
living tissue. On the other hand, we were seriously hampered by the 
lack of unity of the literature concerning these problems, and by 
the absence of any common terminology, or even of a single name for 
the field. After much consideration, we have come to the conclusion 
that all the existing terminology has too heavy a bias to one side or 
another to serve the future development of the field as well as it 
should; and as happens so often to scientists, we have been forced to 
coin at least one artificial neo-Greek expression to fill the gap. We 
have decided to call the entire field of control and communication 
theory, whether in the machine or in the animal, by the name 
Cybernetics, which we form from the Greek κυβερνήτης or steersman. 

In choosing this term, we wish to recognize that the first significant 
paper on feedback mechanisms is an article on governors, which 
was published by Clerk Maxwell in 1868, 3 and that governor is 

1 Kolmogoroff, A. N., “Interpolation und Extrapolation von stationären zufälligen 
Folgen,” Bull. Acad. Sci. U.S.S.R., Ser. Math. 5, 3-14 (1941). 

2 Schrödinger, Erwin, What is Life?, Cambridge University Press, Cambridge, 
England, 1945. 

3 Maxwell, J. C., Proc. Roy. Soc. (London), 16, 270-283 (1868). 
derived from a Latin corruption of κυβερνήτης. We also wish to 
refer to the fact that the steering engines of a ship are indeed one of 
the earliest and best-developed forms of feedback mechanisms. 

Although the term cybernetics does not date further back than the 
summer of 1947, we shall find it convenient to use in referring to 
earlier epochs of the development of the field. From 1942 or 
thereabouts, the development of the subject went ahead on several 
fronts. First, the ideas of the joint paper by Bigelow, Rosenblueth, 
and Wiener were disseminated by Dr. Rosenblueth at a meeting held 
in New York in 1942, under the auspices of the Josiah Macy Foundation, 
and devoted to problems of central inhibition in the nervous 
system. Among those present at that meeting was Dr. Warren 
McCulloch, of the Medical School of the University of Illinois, who 
had already been in touch with Dr. Rosenblueth and myself, and who 
was interested in the study of the organization of the cortex of the 
brain. 

At this point there enters an element which occurs repeatedly in 
the history of cybernetics: the influence of mathematical logic. If I 
were to choose a patron saint for cybernetics out of the history of 
science, I should have to choose Leibniz. The philosophy of 
Leibniz centers about two closely related concepts: that of a universal 
symbolism and that of a calculus of reasoning. From these are 
descended the mathematical notation and the symbolic logic of the 
present day. Now, just as the calculus of arithmetic lends itself to a 
mechanization progressing through the abacus and the desk com- 
puting machine to the ultra-rapid computing machines of the present 
day, so the calculus ratiocinator of Leibniz contains the germs of the 
machina ratiocinatrix, the reasoning machine. Indeed, Leibniz 
himself, like his predecessor Pascal, was interested in the construction 
of computing machines in the metal. It is therefore not in the least 
surprising that the same intellectual impulse which has led to the 
development of mathematical logic has at the same time led to the 
ideal or actual mechanization of processes of thought. 

A mathematical proof which we can follow is one which can be 
written in a finite number of symbols. These symbols, in fact, may 
make an appeal to the notion of infinity, but this appeal is one which 
we can sum up in a finite number of stages, as in the case of mathematical 
induction, where we prove a theorem depending on a parameter 
n for n = 0, and also prove that the case n + 1 follows from 
the case n, thus establishing the theorem for all positive values of n. 
Moreover, the rules of operation of our deductive mechanism must 
be finite in number, even though they may appear to be otherwise, 
through a reference to the concept of infinity, which can itself be 
stated in finite terms. In short, it has become quite evident, both to 
the nominalists like Hilbert and to the intuitionists like Weyl, that 
the development of a mathematico-logical theory is subject to the 
same sort of restrictions as those that limit the performance of a 
computing machine. As we shall see later, it is even possible to 
interpret in this way the paradoxes of Cantor and of Russell. 

I am myself a former student of Russell and owe much to his 
influence. Dr. Shannon took for his doctor’s thesis at the Massachusetts 
Institute of Technology the application of the techniques of 
the classical Boolean algebra of classes to the study of switching 
systems in electrical engineering. Turing, who is perhaps first among 
those who have studied the logical possibilities of the machine as an 
intellectual experiment, served the British government during the 
war as a worker in electronics, and is now in charge of the program 
which the National Physical Laboratory at Teddington has undertaken 
for the development of computing machines of the modern 
type. 

Another young migrant from the field of mathematical logic to 
cybernetics is Walter Pitts. He had been a student of Carnap at 
Chicago and had also been in contact with Professor Rashevsky and 
his school of biophysicists. Let it be remarked in passing that this 
group has contributed much to directing the attention of the mathematically 
minded to the possibilities of the biological sciences, 
although it may seem to some of us that they are too dominated by 
problems of energy and potential and the methods of classical physics 
to do the best possible work in the study of systems like the nervous 
system, which are very far from being closed energetically. 

Mr. Pitts had the good fortune to fall under McCulloch’s influence, 
and the two began to work quite early on problems concerning the 
union of nerve fibers by synapses into systems with given over-all 
properties. Independently of Shannon, they had used the technique 
of mathematical logic for the discussion of what were after all 
switching problems. They added elements which were not prominent 
in Shannon’s earlier work, although they are certainly suggested by 
the ideas of Turing: the use of the time as a parameter, the consideration 
of nets containing cycles, and of synaptic and other delays. 1 

In the summer of 1943, I met Dr. J. Lettvin of the Boston City 
Hospital, who was very much interested in matters concerning 

1 Turing, A. M., “On Computable Numbers, with an Application to the 
Entscheidungsproblem,” Proceedings of the London Mathematical Society, Ser. 2, 42, 
230-265 (1936). 
nervous mechanisms. He was a close friend of Mr. Pitts, and made 
me acquainted with his work. 1 He induced Mr. Pitts to come out to 
Boston, and to make the acquaintance of Dr. Rosenblueth and myself. 
We welcomed him into our group. Mr. Pitts came to the 
Massachusetts Institute of Technology in the autumn of 1943, in 
order to work with me and to strengthen his mathematical background 
for the study of the new science of cybernetics, which had by 
that time been fairly born but not yet christened. 

At that time Mr. Pitts was already thoroughly acquainted with 
mathematical logic and neurophysiology, but had not had the chance 
to make very many engineering contacts. In particular, he was not 
acquainted with Dr. Shannon’s work, and he had not had much 
experience of the possibilities of electronics. He was very much 
interested when I showed him examples of modern vacuum tubes and 
explained to him that these were ideal means for realizing in the metal 
the equivalents of his neuronic circuits and systems. From that 
time, it became clear to us that the ultra-rapid computing machine, 
depending as it does on consecutive switching devices, must represent 
almost an ideal model of the problems arising in the nervous system. 
The all-or-none character of the discharge of the neurons is precisely 
analogous to the single choice made in determining a digit on the 
binary scale, which more than one of us had already contemplated as 
the most satisfactory basis of computing-machine design. The 
synapse is nothing but a mechanism for determining whether a certain 
combination of outputs from other selected elements will or will not 
act as an adequate stimulus for the discharge of the next element, 
and must have its precise analogue in the computing machine. The 
problem of interpreting the nature and varieties of memory in the 
animal has its parallel in the problem of constructing artificial 
memories for the machine. 
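
The analogy between the all-or-none neuron and the binary switching element can be illustrated by a minimal threshold unit. This is a simplified sketch in the spirit of the McCulloch and Pitts model, not a reproduction of their calculus: the numeric weights, the thresholds, and the function names are assumptions made for illustration.

```python
def threshold_unit(inputs, weights, threshold):
    """Fire (1) if the weighted combination of inputs reaches the threshold, else 0.

    Inputs are themselves all-or-none (0 or 1), like digits on the binary scale.
    """
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0


def and_gate(a, b):
    # Both incoming elements must discharge to stimulate the next one.
    return threshold_unit((a, b), (1, 1), threshold=2)


def or_gate(a, b):
    # Either incoming element alone is an adequate stimulus.
    return threshold_unit((a, b), (1, 1), threshold=1)


for a in (0, 1):
    for b in (0, 1):
        print(a, b, and_gate(a, b), or_gate(a, b))
```

Changing only the threshold turns the same element from one logical function into another, which is the sense in which the synapse decides whether a given combination of outputs is an adequate stimulus.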

At this time, the construction of computing machines had proved 
to be more essential for the war effort than the first opinion of Dr. 
Bush might have indicated, and was progressing at several centers 
along lines not too different from those which my earlier report had 
indicated. Harvard, Aberdeen Proving Ground, and the University 
of Pennsylvania were already constructing machines, and the 
Institute for Advanced Study at Princeton and the Massachusetts 
Institute of Technology were soon to enter the same field. In this 
program there was a gradual progress from the mechanical assembly
to the electrical assembly, from the scale of ten to the scale of two,

1 McCulloch, W. S., and W. Pitts, “A logical calculus of the ideas immanent in
nervous activity,” Bull. Math. Biophys., 5, 115-133 (1943).

from the mechanical relay to the electrical relay, from humanly
directed operation to automatically directed operation; and in 
short, each new machine more than the last was in conformity with
the memorandum I had sent Dr. Bush. There was a continual
going and coming of those interested in these fields. We had an 
opportunity to communicate our ideas to our colleagues, in particular 
to Dr. Aiken of Harvard, Dr. von Neumann of the Institute for 
Advanced Study, and Dr. Goldstine of the Eniac and Edvac 
machines at the University of Pennsylvania. Everywhere we met 
with a sympathetic hearing, and the vocabulary of the engineers 
soon became contaminated with the terms of the neurophysiologist 
and the psychologist. 

At this stage of the proceedings, Dr. von Neumann and myself felt 
it desirable to hold a joint meeting of all those interested in what we 
now call cybernetics, and this meeting took place at Princeton in the
late winter of 1943-1944. Engineers, physiologists, and mathe- 
maticians were all represented. It was impossible to have Dr. 
Rosenblueth among us, as he had just accepted an invitation to act 
as Head of the laboratories of physiology of the Instituto Nacional 
de Cardiología in Mexico, but Dr. McCulloch and Dr. Lorente de Nó
of the Rockefeller Institute represented the physiologists. Dr. 
Aiken was unable to be present; however, Dr. Goldstine was one of a 
group of several computing-machine designers who participated in 
the meeting, while Dr. von Neumann, Mr. Pitts, and myself were the 
mathematicians. The physiologists gave a joint presentation of 
cybernetic problems from their point of view; similarly, the comput-
ing-machine designers presented their methods and objectives. At
the end of the meeting, it had become clear to all that there was a
substantial common basis of ideas between the workers in the different 
fields, that people in each group could already use notions which had 
been better developed by the others, and that some attempt should 
be made to achieve a common vocabulary. 

A considerable period before this, the war research group conducted 
by Dr. Warren Weaver had published a document, first secret and 
later restricted, covering the work of Mr. Bigelow and myself on 
predictors and wave filters. It was found that the conditions of 
anti-aircraft fire did not justify the design of special apparatus for 
curvilinear prediction, but the principles proved to be sound and
practical, and have been used by the government for smoothing
purposes, and in several fields of related work. In particular, the
type of integral equation to which the calculus of variations problem
reduces itself has been shown to emerge in wave-guide problems and
in many other problems of an applied mathematical interest. Thus
in one way or another, the end of the war saw the ideas of prediction 
theory and of the statistical approach to communication engineering 
already familiar to a large part of the statisticians and communica-
tion engineers of the United States and Great Britain. It also saw
my government document, now out of print, and a considerable
number of expository papers by Levinson, 1 Wallman, Daniell,
Phillips, and others written to fill the gap. I myself have had a long
mathematical expository paper under way for several years to put
the work I have done on permanent record, but circumstances not 
completely under my control have prevented its prompt publication. 
Finally, after a joint meeting of the American Mathematical Society
and the Institute of Mathematical Statistics held in New York in 
the spring of 1947, and devoted to the study of stochastic processes 
from a point of view closely allied to cybernetics, I have passed on 
what I have already written of my manuscript to Professor Doob of 
the University of Illinois, to be developed in his notation and accord- 
ing to his ideas as a book for the Mathematical Surveys series of the 
American Mathematical Society. I had already developed part of 
my work in a course of lectures in the mathematics department of 
M.I.T. in the summer of 1946. Since then, my old student and
collaborator, 2 Dr. Y. W. Lee, has returned from China. He is
giving a course on the new methods for the design of wave filters and 
similar apparatus in the M.I.T. electrical engineering department in 
the fall of 1947, and has plans to work the material of these lectures 
up into a book. At the same time, the out-of-print government 
document is to be reprinted. 3 

As I have said, Dr. Rosenblueth returned to Mexico about the
beginning of 1944. In the spring of 1945, I received an invitation
from the Mexican Mathematical Society to participate in a meeting
to be held in Guadalajara that June. This invitation was reinforced
by the Comisión Instigadora y Coordinadora de la Investigación
Científica, under the leadership of Dr. Manuel Sandoval Vallarta, of
whom I have already spoken. Dr. Rosenblueth invited me to share 
some scientific research with him, and the Instituto Nacional de 
Cardiología, under its director Dr. Ignacio Chávez, extended me its 
hospitality. 

I stayed some ten weeks in Mexico at that time. Dr. Rosenblueth

1 Levinson, N., J. Math. and Physics, 25, 261-278; 26, 110-119 (1947).

2 Lee, Y. W., J. Math. and Physics, 11, 261-278 (1932).

3 Wiener, N., Extrapolation, Interpolation, and Smoothing of Stationary Time
Series, Technology Press and Wiley, New York, 1949.

and I decided to continue a line of work which we had already dis-
cussed with Dr. Walter B. Cannon, who was also with Dr. Rosen-
blueth, on a visit which unfortunately proved to be his last. This
work had to do with the relation between, on the one hand, the tonic, 
clonic, and phasic contractions in epilepsy and, on the other hand, 
the tonic spasm, beat, and fibrillation of the heart. We felt that
heart muscle represented an irritable tissue as useful for the investiga-
tion of conduction mechanisms as nerve tissue, and furthermore,
that the anastomoses and decussations of the heart-muscle fibers 
presented us with a simpler phenomenon than the problem of the
nervous synapse. We were also deeply grateful to Dr. Chávez for
his unquestioning hospitality, and, while it has never been the policy 
of the Instituto to restrict Dr. Rosenblueth to the investigation of 
the heart, we were grateful to have an opportunity to contribute to
its principal purpose. 

Our investigation took two directions: the study of phenomena 
of conductivity and latency in uniform conducting media of two or 
more dimensions, and the statistical study of the conducting proper- 
ties of random nets of conducting fibers. The first led us to the 
rudiments of a theory of heart flutter, the latter to a certain possible 
understanding of fibrillation. Both lines of work were developed in 
a paper, 1 published by us, and, although in both cases our earlier
results have shown the need of a considerable amount of revision and
of supplementation, the work on flutter is being revised by Mr. 
Oliver G. Selfridge of the Massachusetts Institute of Technology, 
while the statistical technique used in the study of heart-muscle nets 
has been extended to the treatment of neuronal nets by Mr. Walter
Pitts, now a Fellow of the John Simon Guggenheim Foundation.
The experimental work is being carried on by Dr. Rosenblueth with
the aid of Dr. F. García Ramos of the Instituto Nacional de Car-
diología and the Mexican Army Medical School.

At the Guadalajara meeting of the Mexican Mathematical Society, 
Dr. Rosenblueth and I presented some of our results. We had 
already come to the conclusion that our earlier plans of collaboration
had shown themselves to be practicable. We were fortunate
enough to have a chance to present our results to a larger audience. 
In the spring of 1946, Dr. McCulloch had made arrangements with 
the Josiah Macy Foundation for the first of a series of meetings to be 
held in New York and to be devoted to the problems of feedback. 

1 Wiener, N., and A. Rosenblueth, “The Mathematical Formulation of the Problem
of Conduction of Impulses in a Network of Connected Excitable Elements, Specifically
in Cardiac Muscle,” Arch. Inst. Cardiol. Méx., 16, 205-265 (1946).

These meetings have been conducted in the traditional Macy way,
worked out most efficiently by Dr. Frank Fremont-Smith, who
organized them on behalf of the Foundation. The idea has been to
get together a group of modest size, not exceeding some twenty in
number, of workers in various related fields, and to hold them to-
gether for two successive days in all-day series of informal papers,
discussions, and meals together, until they had had the opportunity
to thresh out their differences and to make progress in thinking along
the same lines. The nucleus of our meetings has been the group that
had assembled in Princeton in 1944, but Drs. McCulloch and Fre-
mont-Smith have rightly seen the psychological and sociological
implications of the subject, and have co-opted into the group a
number of leading psychologists, sociologists, and anthropologists. 
The need of including psychologists had indeed been obvious from 
the beginning. He who studies the nervous system cannot forget the 
mind, and he who studies the mind cannot forget the nervous system. 
Much of the psychology of the past has proved to be really nothing 
more than the physiology of the organs of special sense; and the 
whole weight of the body of ideas which cybernetics is introducing
into psychology concerns the physiology and anatomy of the highly
specialized cortical areas connecting with these organs of special 
sense. From the beginning, we have anticipated that the problem 
of the perception of Gestalt, or of the perceptual formation of uni-
versals, would prove to be of this nature. What is the mechanism by
which we recognize a square as a square, irrespective of its position, 
its size, and its orientation? To assist us in such matters and to 
inform them of whatever use might be made of our concepts for their 
assistance, we had among us such psychologists as Professor Klüver 
of the University of Chicago, the late Dr. Kurt Lewin of the Mas- 
sachusetts Institute of Technology, and Dr. M. Ericsson of New York. 

As to sociology and anthropology, it is manifest that the im-
portance of information and communication as mechanisms of
organization proceeds beyond the individual into the community. 
On the one hand, it is completely impossible to understand social 
communities such as those of ants without a thorough investigation
of their means of communication, and we were fortunate enough to
have the aid of Dr. Schneirla in this matter. For the similar prob-
lems of human organization, we sought help from the anthropologists
Drs. Bateson and Margaret Mead; while Dr. Morgenstern of the
Institute for Advanced Study was our adviser in the significant 
field of social organization belonging to economic theory. His very 
important joint book on games with Dr. von Neumann, by the way,
represents a most interesting study of social organization from the
point of view of methods closely related to, although distinct from,
the subject matter of cybernetics. Dr. Lewin and others represented
the newer work on the theory of opinion sampling and the practice
of opinion making, and Dr. F. C. S. Northrup was interested in
assaying the philosophical significance of our work. 

This does not purport to be a complete list of our group. We 
also enlarged the group to contain more engineers and mathematicians 
such as Bigelow and Savage, more neuroanatomists and neuro- 
physiologists such as von Bonin and Lloyd, and so on. Our first 
meeting, held in the spring of 1946, was largely devoted to didactic
papers by those of us who had been present at the Princeton meeting
and to a general assessment of the importance of the field by all
present. It was the sense of the meeting that the ideas behind
cybernetics were sufficiently important and interesting to those
present to warrant a continuation of our meetings at intervals of six
months; and that before the next full meeting, we should have a
small meeting for the benefit of the less mathematically trained to
explain to them in as simple language as possible the nature of the 
mathematical concepts involved. 

In the summer of 1946, I returned to Mexico with the support of
the Rockefeller Foundation and the hospitality of the Instituto
Nacional de Cardiología to continue the collaboration between Dr.
Rosenblueth and myself. This time we decided to take a nervous 
problem directly from the topic of feedback and to see what we 
could do with it experimentally. We chose the cat as our experi-
mental animal, and the quadriceps extensor femoris as the muscle to
study. We cut the attachment of the muscle, fixed it to a lever
under known tension, and recorded its contractions isometrically or
isotonically. We also used an oscillograph to record the simul-
taneous electrical changes in the muscle itself. We worked chiefly
with cats, first decerebrated under ether anesthesia and later made 
spinal by a thoracic transection of the cord. In many cases, 
strychnine was used to increase the reflex responses. The muscle 
was loaded to the point where a tap would set it into a periodic 
pattern of contraction, which is called clonus in the language of the
physiologist. We observed this pattern of contraction, paying
attention to the physiological condition of the cat, the load on the
muscle, the frequency of oscillation, the base level of the oscillation,
and its amplitude. These we tried to analyze as we should analyze a 
mechanical or electrical system exhibiting the same pattern of 
hunting. We employed, for example, the methods of MacColl's
book on servomechanisms. This is not the place to discuss the full
significance of our results, which we are now repeating and preparing 
to write up for publication. However, the following statements are 
either established or very probable: that the frequency of clonic 
oscillation is much less sensitive to changes of the loading conditions 
than we had expected, and that it is much more nearly determined 
by the constants of the closed arc (efferent-nerve)-muscle-(kines-
thetic-end-body)-(afferent-nerve)-(central-synapse)-(efferent-nerve)
than by anything else. This circuit is not even approximately a
circuit of linear operators if we take as our base of linearity the num-
ber of impulses transmitted by the efferent nerve per second, but
seems to become much more nearly so if we replace the number of 
impulses by its logarithm. This corresponds to the fact that the 
form of the envelope of stimulation of the efferent nerve is not nearly 
sinusoidal, but that the logarithm of this curve is much more nearly 
sinusoidal; while in a linear oscillating system with constant energy 
level, the form of the curve of stimulation must be sinusoidal in all
except a set of cases of zero probability. Again, the notions of 
facilitation and inhibition are much more nearly multiplicative than 
additive in nature. For example, a complete inhibition means a 
multiplication by zero, and a partial inhibition means a multiplica- 
tion by a small quantity. It is these notions of inhibition and 
facilitation which have been used 1 in the discussion of the reflex arc.
Furthermore, the synapse is a coincidence-recorder, and the outgoing 
fiber is stimulated only if the number of incoming impulses in a small 
summation time exceeds a certain threshold. If this threshold is 
low enough in comparison with the full number of incoming synapses,
the synaptic mechanism serves to multiply probabilities, and that it
can be even an approximately linear link is possible only in a
logarithmic system. This approximate logarithmicity of the synapse 
mechanism is certainly allied to the approximate logarithmicity of 
the Weber-Fechner law of sensation intensity, even though this law 
is only a first approximation. 
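
The multiplicative character of inhibition and of synaptic coincidence can be illustrated with a small numerical sketch. It is only an illustration under stated assumptions, not a reconstruction of the experimental analysis: the function names and the sample values are hypothetical, and the synapse is idealized as firing only when every one of a set of independent incoming fibers fires within the same summation time.

```python
import math


def inhibit(rate, factor):
    """Apply inhibition as a multiplication of an impulse rate.

    factor = 0 models complete inhibition, a small factor a partial one.
    (Hypothetical helper, introduced here for illustration only.)
    """
    if not 0.0 <= factor <= 1.0:
        raise ValueError("inhibition factor must lie in [0, 1]")
    return rate * factor


def coincidence_probability(probs):
    """Probability that all independent inputs fire in one summation time.

    Idealized assumption: the outgoing fiber fires only on full coincidence,
    so the individual firing probabilities simply multiply.
    """
    joint = 1.0
    for p in probs:
        joint *= p
    return joint


def summed_log_probabilities(probs):
    """Sum of the logarithms of the input probabilities."""
    return sum(math.log(p) for p in probs)


if __name__ == "__main__":
    rate = 40.0  # assumed impulses per second, not a measured value
    print("complete inhibition:", inhibit(rate, 0.0))
    print("partial inhibition: ", inhibit(rate, 0.25))

    inputs = [0.5, 0.2, 0.9]  # assumed firing probabilities
    print("log of product:", math.log(coincidence_probability(inputs)))
    print("sum of logs:   ", summed_log_probabilities(inputs))
```

On the logarithmic scale the product of probabilities becomes a sum, which is the sense in which such a link can be approximately linear only in a logarithmic system.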

The most striking point is that on this logarithmic basis, and with 
data obtained from the conduction of single pulses through the 
various elements of the neuromuscular arc, we were able to obtain
very fair approximations to the actual periods of clonic vibration,
using the technique already developed by the servo engineers for the
determination of the frequencies of hunting oscillations in feedback
systems which have broken down. We obtained theoretical oscilla-
tions of about 13.9 per second, in cases where the observed oscillations
varied between frequencies of 7 and 30, but generally remained
within a range varying somewhere between 12 and 17. Under the
circumstances, this agreement is excellent.

1 Unpublished articles on clonus from the Instituto Nacional de Cardiología,
Mexico.

The frequency of clonus is not the only important phenomenon 
which we may observe: there is also a relatively slow change in 
basal tension, and an even slower change in amplitude. These
phenomena are certainly by no means linear. However, sufficiently
slow changes in the constants of a linear oscillating system may be
treated to a first approximation as though they were infinitely slow,
and as though over each part of the oscillation the system behaved as
it would if its parameters were those belonging to it at the time. This
is the method known in other branches of physics as that of secular
perturbations. It may be used to study the problems of base level
and amplitude of clonus. While this work has not yet been com-
pleted, it is clear that it is both possible and promising. There is a
strong suggestion that though the timing of the main arc in clonus
proves it to be a two-neuron arc, the amplification of impulses in this
arc is variable in one and perhaps in more points, and that some part
of this amplification may be affected by slow, multineuron processes
which run much higher in the central nervous system than the spinal 
chain primarily responsible for the timing of clonus. This variable 
amplification may be affected by the general level of central activity,
by the use of strychnine or of anesthetics, by decerebration, and by 
many other causes. 
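
The idea of treating slow changes in the constants as though they were infinitely slow can be checked on a toy example. The sketch below is an assumption-laden illustration and not the analysis applied to clonus: it integrates a linear oscillator x'' = -omega(t)^2 x whose angular frequency drifts by one per cent per second (a hypothetical choice), and compares each measured period with the period the oscillator would have if its frequency were frozen at the value belonging to it at that time.

```python
import math


def upward_crossings(omega, t_end, dt=1e-3):
    """Integrate x'' = -omega(t)**2 * x by velocity Verlet.

    Returns the times at which x passes upward through zero, located by
    linear interpolation between successive steps.
    """
    x, v, t = 0.0, 1.0, 0.0
    acc = -omega(t) ** 2 * x
    times = []
    for _ in range(int(round(t_end / dt))):
        v_half = v + 0.5 * dt * acc
        x_new = x + dt * v_half
        t_new = t + dt
        acc_new = -omega(t_new) ** 2 * x_new
        v = v_half + 0.5 * dt * acc_new
        if x < 0.0 <= x_new:
            times.append(t + dt * (-x) / (x_new - x))
        x, t, acc = x_new, t_new, acc_new
    return times


def slowly_drifting_omega(t):
    """Assumed angular frequency: 1 cycle per second, rising 1% per second."""
    return 2.0 * math.pi * (1.0 + 0.01 * t)


def compare_periods(times, omega):
    """Pair each measured period with the frozen-parameter prediction."""
    rows = []
    for t0, t1 in zip(times, times[1:]):
        measured = t1 - t0
        frozen = 2.0 * math.pi / omega(0.5 * (t0 + t1))
        rows.append((measured, frozen))
    return rows


if __name__ == "__main__":
    crossings = upward_crossings(slowly_drifting_omega, 10.0)
    for measured, frozen in compare_periods(crossings, slowly_drifting_omega):
        print(f"measured {measured:.5f} s   frozen-parameter {frozen:.5f} s")
```

For a drift this slow the two columns should agree closely; a faster drift would be expected to widen the gap, which marks the limit of the first approximation.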

These were the main results presented by Dr. Rosenblueth and
myself at the Macy meeting held in the autumn of 1946, and in a 
meeting of the New York Academy of Sciences held at the same time 
for the purpose of diffusing the notions of cybernetics over a larger 
public. While we were pleased with our results, and fully convinced
of the general practicability of work in this direction, we felt never-
theless that the time of our collaboration had been too brief, and that
our work had been done under too much pressure to make it desirable
to publish without further experimental confirmation. This con-
firmation—which naturally might amount to a refutation—we are
now seeking in the summer and autumn of 1947. 

The Rockefeller Foundation had already given Dr. Rosenblueth a 
grant for the equipment of a new laboratory building at the Instituto 
Nacional de Cardiología. We felt that the time was now ripe for us 
to go jointly to them—that is, to Dr. Warren Weaver, in charge of 
the department of physical sciences, and to Dr. Robert Morison, in
charge of the department of medical sciences—to establish the basis
of a long-time scientific collaboration, in order to carry on our
program at a more leisurely and healthy pace. In this we were 
enthusiastically backed by our respective institutions. Dr. George 
Harrison, Dean of Science, was the chief representative of the 
Massachusetts Institute of Technology during these negotiations, 
while Dr. Ignacio Chávez spoke for his institution, the Instituto 
Nacional de Cardiología. During the negotiations, it became clear 
that the laboratory center of the joint activity should be at the 
Instituto, both in order to avoid the duplication of laboratory 
equipment and to further the very real interest the Rockefeller 
Foundation has shown in the establishment of scientific centers in 
Latin America. The plan finally adopted was for five years, during
which I should spend six months of every other year at the Instituto,
while Dr. Rosenblueth would spend six months of the intervening 
years at the Institute. The time at the Instituto is to be devoted to 
the obtaining and elucidation of experimental data pertaining to 
cybernetics, while the intermediate years are to be devoted to more
theoretical research and, above all, to the very difficult problem of 
devising, for people wishing to go into this new field, a scheme of 
training which will secure for them both the necessary mathematical,
physical, and engineering background and the proper acquaintance
with biological, psychological, and medical techniques.

In the spring of 1947, Dr. McCulloch and Mr. Pitts did a piece of 
work of considerable cybernetic importance. Dr. McCulloch had 
been given the problem of designing an apparatus to enable the blind 
to read the printed page by ear. The production of variable tones 
by type through the agency of a photocell is an old story, and can be
effected by any number of methods; the difficult point is to make the
pattern of the sound substantially the same when the pattern of the
letters is given, whatever the size. This is a definite analogue of 
the problem of the perception of form, of Gestalt, which allows us to 
recognize a square as a square through a large number of changes of 
size and of orientation. Dr. McCulloch’s device involved a selective
reading of the type imprint for a set of different magnifications.
Such a selective reading can be performed automatically as a
scanning process. This scanning, to allow a comparison between a 
figure and a given standard figure of fixed but different size, was a 
device which I had already suggested at one of the Macy meetings. 
A diagram of the apparatus by which the selective reading was done 
came to the attention of Dr. von Bonin, who immediately asked, “Is
this a diagram of the fourth layer of the visual cortex of the brain?”
Acting on this suggestion, Dr. McCulloch, with the assistance of
Mr. Pitts, produced a theory tying up the anatomy and the physi-
ology of the visual cortex, and in this theory the operation of scanning
over a set of transformations plays an important part. This was 
presented in the spring of 1947, both at the Macy meeting and at a 
meeting of the New York Academy of Sciences. Finally, this
scanning process involves a certain periodic time, which corresponds
to what we call the “time of sweep” in ordinary television. There
are various anatomic clues to this time in the length of the chain of 
consecutive synapses necessary to run around one cycle of perform-
ance. These yield a time of the order of a tenth of a second for a
complete performance of the cycle of operations, and this is the
approximate period of the so-called “alpha rhythm” of the brain.
Finally, the alpha rhythm, on quite other evidence, has already been 
conjectured to be of visual origin and to be important in the process 
of form perception. 
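
The scanning over magnifications described above can be sketched in a few lines. This is a deliberately simplified illustration and makes no claim to represent Dr. McCulloch's apparatus: a figure is assumed to be given as a list of corner points in the same order, position, and orientation as the standard figure, and all function names are hypothetical. Only the size is scanned.

```python
def magnify(points, m):
    """Scale every point of a figure by the magnification m."""
    return [(m * x, m * y) for x, y in points]


def mismatch(figure, standard):
    """Sum of squared distances between corresponding points."""
    return sum((fx - sx) ** 2 + (fy - sy) ** 2
               for (fx, fy), (sx, sy) in zip(figure, standard))


def scan_magnifications(figure, standard, magnifications):
    """Scan a set of magnifications and return the best-fitting one."""
    return min(magnifications,
               key=lambda m: mismatch(magnify(figure, m), standard))


def matches_standard(figure, standard, magnifications, tolerance=1e-9):
    """True if some scanned magnification brings the figure onto the standard."""
    best = scan_magnifications(figure, standard, magnifications)
    return mismatch(magnify(figure, best), standard) <= tolerance


if __name__ == "__main__":
    unit_square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
    large_square = magnify(unit_square, 4.0)
    not_a_square = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (0.0, 0.0)]
    scan = [0.125, 0.25, 0.5, 1.0, 2.0, 4.0]

    print(scan_magnifications(large_square, unit_square, scan))
    print(matches_standard(large_square, unit_square, scan))
    print(matches_standard(not_a_square, unit_square, scan))
```

The design choice worth noting is that recognition irrespective of size is obtained by running through a group of transformations of the figure, and not by storing one standard for every possible size.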

In the spring of 1947, I received an invitation to participate in a
mathematical conference in Nancy on problems arising from har-
monic analysis. I accepted and, on my voyage there and back,
spent a total of three weeks in England, chiefly as a guest of my old
friend Professor J. B. S. Haldane. I had an excellent chance to 
meet most of those doing work on ultra-rapid computing machines, 
especially at Manchester and at the National Physical Laboratories 
at Teddington, and above all to talk over the fundamental ideas of 
cybernetics with Mr. Turing at Teddington. I also visited the 
Psychological Laboratory at Cambridge, and had a very good chance 
to discuss the work that Professor F. C. Bartlett and his staff were 
doing on the human element in control processes involving such an 
element. I found the interest in cybernetics about as great and well 
informed in England as in the United States, and the engineering 
work excellent, though of course limited by the smaller funds 
available. I found much interest and understanding of its possibility 
in many quarters, and Professors Haldane, H. Levy, and Bernal
certainly regarded it as one of the most urgent problems on the
agenda of science and scientific philosophy. I did not find, however,
that as much progress had been made in unifying the subject and in 
pulling the various threads of research together as we had made at 
home in the States. 

In France, the meeting at Nancy on harmonic analysis contained a
number of papers uniting statistical ideas and ideas from com-
munication engineering in a manner wholly in conformity with
the point of view of cybernetics. Here I must mention especially
the names of M. Blanc-Lapierre and M. Loève. I found also a
considerable interest in the subject on the part of mathematicians,
physiologists, and physical chemists, particularly with regard to its
thermodynamic aspects in so far as they touch the more general
problem of the nature of life itself. Indeed, I had discussed that
subject in Boston, before my departure, with Professor Szent-
Györgyi, the Hungarian biochemist, and had found his ideas con-
cordant with my own.

One event during my French visit is particularly worth while
noting here. My colleague, Professor G. de Santillana of M.I.T.,
introduced me to M. Freymann, of the firm of Hermann et Cie, and
he requested of me the present book. I am particularly glad to
receive his invitation, as M. Freymann is a Mexican, and the writing
of the present book, as well as a good deal of the research leading up
to it, has been done in Mexico.

As I have already hinted, one of the directions of work which the 
realm of ideas of the Macy meetings has suggested concerns the 
importance of the notion and the technique of communication in 
the social system. It is certainly true that the social system is an 
organization like the individual, that it is bound together by a system 
of communication, and that it has a dynamics in which circular 
processes of a feedback nature play an important part. This is true, 
both in the general fields of anthropology and of sociology and in the 
more specific field of economics; and the very important work, which 
we have already mentioned, of von Neumann and Morgenstern on the 
theory of games enters into this range of ideas. On this basis, Drs.
Gregory Bateson and Margaret Mead have urged me, in view of the
intensely pressing nature of the sociological and economic problems
of the present age of confusion, to devote a large part of my energies
to the discussion of this side of cybernetics. 

Much as I sympathize with their sense of the urgency of the situa- 
tion, and much as I hope that they and other competent workers will 
take up problems of this sort, which I shall discuss in a later chapter
of this book, I can share neither their feeling that this field has the
first claim on my attention, nor their hopefulness that sufficient
progress can be registered in this direction to have an appreciable
therapeutic effect in the present diseases of society. To begin with, 
the main quantities affecting society are not only statistical, but the 
runs of statistics on which they are based are excessively short. 
There is no great use in lumping under one head the economics of 
steel industry before and after the introduction of the Bessemer
process, nor in comparing the statistics of rubber production before
and after the burgeoning of the automobile industry and the cultiva-
tion of Hevea in Malaya. Neither is there any important point in
running statistics of the incidence of venereal disease in a single table
which covers both the period before and that after the introduction
of salvarsan, unless for the specific purpose of studying the effective-
ness of this drug. For a good statistic of society, we need long runs
under essentially constant conditions, just as for a good resolution of
light we need a lens with a large aperture. The effective aperture of 
a lens is not appreciably increased by augmenting its nominal 
aperture, unless the lens is made of a material so homogeneous that the 
delay of light in different parts of the lens conforms to the proper
designed amount by less than a small part of a wavelength. Similarly, 
the advantage of long runs of statistics under widely varying conditions 
is specious and spurious. Thus the human sciences are very poor
testing-grounds for a new mathematical technique: as poor as the 
statistical mechanics of a gas would be to a being of the order of size 
of a molecule, to whom the fluctuations which we ignore from a 
larger standpoint would be precisely the matters of greatest interest. 
Moreover, in the absence of reasonably safe routine numerical 
techniques, the element of the judgment of the expert in determining 
the estimates to be made of sociological, anthropological, and 
economic quantities is so great that it is no field for a newcomer who 
has not yet had the bulk of experience which goes to make up the
expert. I may remark parenthetically that the modern apparatus of
the theory of small samples, once it goes beyond the determination of 
its own specially defined parameters and becomes a method for 
positive statistical inference in new cases, does not inspire me with 
any confidence unless it is applied by a statistician by whom the 
main elements of the dynamics of the situation are either explicitly
known or implicitly felt. 

I have just spoken of a field in which my expectations of cybernetics
are definitely tempered by an understanding of the limitations of the
data which we may hope to obtain. There are two other fields where
I ultimately hope to accomplish something practical with the aid of
cybernetic ideas, but in which this hope must wait on further
developments. One of these is the matter of prostheses for lost or
paralyzed limbs. As we have seen in the discussion of Gestalt, the
ideas of communication engineering have already been applied by
McCulloch to the problem of the replacement of lost senses, in the
construction of an instrument to enable the blind to read print by
hearing. Here the instrument suggested by McCulloch takes over
quite explicitly some of the functions not only of the eye but of the
visual cortex. There is a manifest possibility of doing something
similar in the case of artificial limbs. The loss of a segment of limb
implies not only the loss of the purely passive support of the missing
segment or its value as mechanical extension of the stump, and the
loss of the contractile power of its muscles, but implies as well the 
loss of all cutaneous and kinesthetic sensations originating in it. 
The first two losses are what the artificial-limb maker now tries to
replace. The third has so far been beyond his scope. In the case of 
a simple peg leg, this is not important: the rod that replaces the 
missing limb has no degrees of freedom of its own, and the kinesthetic 
mechanism of the stump is fully adequate to report its own position
and velocity. This is not the case with the articulated limb with a 
mobile knee and ankle, thrown ahead by the patient with the aid 
of his remaining musculature. He has no adequate report of their 
position and motion, and this interferes with his sureness of step on 
an irregular terrain. There does not seem to be any insuperable 
difficulty in equipping the artificial joints and the solé of the artificial 
foot with strain or pressure gauges, which are to register electrically 
or otherwise, say through vibrators, on intact areas of skin. The 
present artificial limb removes some of the paralysis caused by the 
amputation but leaves the ataxis. With the use of proper receptors, 
muchof this ataxia should disappear as well, and the patient should be 
able to learn reflexes, such as those we all use in driving a car, which 
should enable him to step out with a much surer gait. What we 
have said about the leg should apply with even more forcé to the arm, 
where the figure of the manikin familiar to all readers of books of 
neurology shows that the sensory loss in an amputation of the thumb 
alone is considerably greater than the sensory loss even in a hip-joint 
amputation. 

I have made an attempt to report these considerations to the 
proper authorities, but up to now I have not been able to accomplish
much. I do not know whether the same ideas have already emanated
from other sources, nor whether they have been tried out and found
technically impracticable. In case they have not yet received a 
thorough practical consideration, they should receive one in the 
immediate future. 

Let me now come to another point which I believe to merit 
attention. It has long been clear to me that the modern ultra-rapid 
computing machine was in principle an ideal central nervous system
to an apparatus for automatic control; and that its input and output
need not be in the form of numbers or diagrams but might very well
be, respectively, the readings of artificial sense organs, such as
photoelectric cells or thermometers, and the performance of motors or
solenoids. With the aid of strain gauges or similar agencies to read
the performance of these motor organs and to report, to “feed back,”
to the central control system as an artificial kinesthetic sense, we are
already in a position to construct artificial machines of almost any
degree of elaborateness of performance. Long before Nagasaki and
the public awareness of the atomic bomb, it had occurred to me that
we were here in the presence of another social potentiality of
unheard-of importance for good and for evil. The automatic factory and the
assembly line without human agents are only so far ahead of us as is
limited by our willingness to put such a degree of effort into their
engineering as was spent, for example, in the development of the
technique of radar in the Second World War.1
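
The arrangement just described, in which a computing element reads an artificial sense organ, drives an effector, and is informed of the result, can be sketched in a few lines of Python. This is only an illustrative toy under stated assumptions: the thermometer, the heater, the room model, and every constant below are hypothetical and are not taken from the text.

```python
# Hypothetical sketch of a feedback loop: sense organ -> central control
# -> effector -> sense organ. All names and constants are assumptions.

def read_thermometer(room_temp):
    """Artificial sense organ: report the current temperature."""
    return room_temp


def heater_power(error, gain=0.5, max_power=1.0):
    """Central control: proportional response, clipped to the effector's range."""
    return min(max(gain * error, 0.0), max_power)


def simulate(setpoint=20.0, start=10.0, outside=5.0, steps=200):
    """Run the loop: sense, compare with the goal, act, and feed back."""
    temp = start
    history = []
    for _ in range(steps):
        measured = read_thermometer(temp)          # input from the sense organ
        power = heater_power(setpoint - measured)  # decision of the controller
        # Toy room model: heating proportional to power, leakage to the outside.
        temp += 2.0 * power - 0.1 * (temp - outside)
        history.append(temp)
    return history


history = simulate()
```

With these assumed constants the loop settles somewhat below the setpoint, since a purely proportional controller needs a residual error to keep the heater on. The point of the sketch is only that the reading of the sense organ, not a number supplied by an operator, determines the next action.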

I have said that this new development has unbounded possibilities 
for good and for evil. For one thing, it makes the metaphorical 
dominance of the machines, as imagined by Samuel Butler, a most 
immediate and non-metaphorical problem. It gives the human race 
a new and most effective collection of mechanical slaves to perform 
its labor. Such mechanical labor has most of the economic properties 
of slave labor, although, unlike slave labor, it does not involve the 
direct demoralizing effects of human cruelty. However, any labor
that accepts the conditions of competition with slave labor accepts
the conditions of slave labor, and is essentially slave labor. The key
word of this statement is competition. It may very well be a good
thing for humanity to have the machine remove from it the need of
menial and disagreeable tasks, or it may not. I do not know. It
cannot be good for these new potentialities to be assessed in the terms
of the market, of the money they save; and it is precisely the terms
of the open market, the “fifth freedom,” that have become the
shibboleth of the sector of American opinion represented by the
National Association of Manufacturers and the Saturday Evening
Post. I say American opinion, for as an American, I know it best,
but the hucksters recognize no national boundary. 

Perhaps I may clarify the historical background of the present 
situation if I say that the first industrial revolution, the revolution 
of the “dark satanic mills,” was the devaluation of the human arm
by the competition of machinery. There is no rate of pay at which a
United States pick-and-shovel laborer can live which is low enough
to compete with the work of a steam shovel as an excavator. The
modern industrial revolution is similarly bound to devalue the
human brain, at least in its simpler and more routine decisions. Of
course, just as the skilled carpenter, the skilled mechanic, the skilled
dressmaker have in some degree survived the first industrial
revolution, so the skilled scientist and the skilled administrator may
survive the second. However, taking the second revolution as
accomplished, the average human being of mediocre attainments or
less has nothing to sell that it is worth anyone’s money to buy.

1 Fortune, 32, 139-147 (October); 163-169 (November, 1945).

The answer, of course, is to have a society based on human values
other than buying or selling. To arrive at this society, we need a
good deal of planning and a good deal of struggle, which, if the best
comes to the best, may be on the plane of ideas, and otherwise, who
knows? I thus felt it my duty to pass on my information and
understanding of the position to those who have an active interest in
the conditions and the future of labor, that is, to the labor unions. I
did manage to make contact with one or two persons high up in the
C.I.O., and from them I received a very intelligent and sympathetic
hearing. Further than these individuals, neither I nor any of them
was able to go. It was their opinion, as it had been my previous
observation and information, both in the United States and in 
England, that the labor unions and the labor movement are in the 
hands of a highly limited personnel, thoroughly well trained in the 
specialized problems of shop stewardship and disputes concerning 
wages and conditions of work, and totally unprepared to enter into 
the larger political, technical, sociological, and economic questions 
which concern the very existence of labor. The reasons for this are 
easy enough to see: the labor union official generally comes from the
exacting life of a workman into the exacting life of an administrator
without any opportunity for a broader training; and for those who
have this training, a union career is not generally inviting; nor, quite
naturally, are the unions receptive to such people. 

Those of us who have contributed to the new science of cybernetics
thus stand in a moral position which is, to say the least, not very
comfortable. We have contributed to the initiation of a new science
which, as I have said, embraces technical developments with great
possibilities for good and for evil. We can only hand it over into the
world that exists about us, and this is the world of Belsen and
Hiroshima. We do not even have the choice of suppressing these
new technical developments. They belong to the age, and the most
any of us can do by suppression is to put the development of the
subject into the hands of the most irresponsible and most venal of our
engineers. The best we can do is to see that a large public
understands the trend and the bearing of the present work, and to confine
our personal efforts to those fields, such as physiology and psychology,
most remote from war and exploitation. As we have seen, there are
those who hope that the good of a better understanding of man and
society which is offered by this new field of work may anticipate and
outweigh the incidental contribution we are making to the
concentration of power (which is always concentrated, by its very conditions of
existence, in the hands of the most unscrupulous). I write in 1947,
and I am compelled to say that it is a very slight hope. 

The author wishes to express his gratitude to Mr. Walter Pitts, 
Mr. Oliver Selfridge, Mr. Georges Dubé, and Mr. Frederic Webster for 
aid in correcting the manuscript and preparing the material for 
publication. 

Instituto Nacional de Cardiología, 
Ciudad de México 

November, 1947

Newtonian and Bergsonian Time


There is a little hymn or song familiar to every German child.
It goes:

„Weisst du, wieviel Sternlein stehen
An dem blauen Himmelszelt?

Weisst du, wieviel Wolken gehen
Weithin über alle Welt?

Gott, der Herr, hat sie gezählet
Dass ihm auch nicht eines fehlet
An der ganzen, grossen Zahl.“

W. Hey

In English this says: “Knowest thou how many stars stand in the
blue tent of heaven? Knowest thou how many clouds pass far over
the whole world? The Lord God hath counted them, that not one of 
the whole great number be lacking.” 

This little song is an interesting theme for the philosopher and the
historian of science, in that it puts side by side two sciences which
have the one similarity of dealing with the heavens above us, but
which in almost every other respect offer an extreme contrast.
Astronomy is the oldest of the sciences, while meteorology is among
the youngest to begin to deserve the name. The more familiar
astronomical phenomena can be predicted for many centuries, while 
a precise prediction of tomorrow’s weather is generally not easy and 
in many places very crude indeed. 

To go back to the poem, the answer to the first question is that, 
within limits, we do know how many stars there are. In the first 
place, apart from minor uncertainties concerning some of the double
and variable stars, a star is a definite object, eminently suitable for
counting and cataloguing; and if a human Durchmusterung of the
stars (as we call these catalogues) stops short for stars less intense
than a certain magnitude, there is nothing too repugnant to us in the 
idea of a divine Durchmusterung going much further. 

On the other hand, if you were to ask the meteorologist to give 
you a similar Durchmusterung of the clouds, he might laugh in your
face, or he might patiently explain that in all the language of
meteorology there is no such thing as a cloud, defined as an object
with a quasi-permanent identity; and that if there were, he neither
possesses the facilities to count them, nor is he in fact interested in
counting them. A topologically inclined meteorologist might
perhaps define a cloud as a connected region of space in which the
density of the part of the water content in the solid or liquid state
exceeds a certain amount, but this definition would not be of the
slightest value to anyone, and would at most represent an extremely
transitory state. What really concerns the meteorologist is some
such statistical statement as, “Boston: January 17, 1960: Sky 38% 
overcast: Cirrocumulus.” 

There is of course a branch of astronomy which deals with what
may be called cosmic meteorology: the study of galaxies and nebulae
and star clusters and their statistics, as pursued for example by
Chandrasekhar, but this is a very young branch of astronomy,
younger than meteorology itself, and is something outside the
tradition of classical astronomy. This tradition, apart from its
purely classificatory, Durchmusterung aspects, was originally
concerned rather with the solar system than with the world of the fixed
stars. It is the astronomy of the solar system which is that chiefly
associated with the names of Copernicus, Kepler, Galileo, and
Newton, and which was the wet nurse of modern physics.

It is indeed an ideally simple science. Even before the existence
of any adequate dynamical theory, even as far back as the
Babylonians, it was realized that eclipses occurred in regular predictable
cycles, extending backward and forward over time. It was realized
that time itself could better be measured by the motion of the stars
in their courses than in any other way. The pattern for all events in
the solar system was the revolution of a wheel or a series of wheels,
whether in the form of the Ptolemaic theory of epicycles or the
Copernican theory of orbits, and in any such theory the future after
a fashion repeats the past. The music of the spheres is a palindrome,
and the book of astronomy reads the same backward as forward.
There is no difference save of initial positions and directions between
the motion of an orrery turned forward and one run in reverse.
Finally, when all this was reduced by Newton to a formal set of
postulates and a closed mechanics, the fundamental laws of this
mechanics were unaltered by the transformation of the time variable
t into its negative.

Thus if we were to take a motion picture of the planets, speeded
up to show a perceptible picture of activity, and were to run the film
backward, it would still be a possible picture of planets conforming
to the Newtonian mechanics. On the other hand, if we were to take
a motion-picture photograph of the turbulence of the clouds in a
thunderhead and reverse it, it would look altogether wrong. We
should see downdrafts where we expect updrafts, turbulence growing
coarser in texture, lightning preceding instead of following the changes
of cloud which usually precede it, and so on indefinitely.
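
The reversibility of the planetary film can be checked numerically. The following Python sketch is an illustration added under stated assumptions, not part of the original argument: it follows a single planet about a fixed sun in units where the product of the gravitational constant and the solar mass is 1, with arbitrarily chosen initial conditions. The velocity Verlet scheme is my choice of integrator, picked because it shares the time symmetry of Newton's laws.

```python
# Illustrative sketch (not from the text): one planet about a fixed sun,
# in units where G * M_sun = 1. Initial conditions are arbitrary assumptions.

def acceleration(x, y):
    """Inverse-square attraction toward the origin."""
    r_cubed = (x * x + y * y) ** 1.5
    return -x / r_cubed, -y / r_cubed


def integrate(state, dt, steps):
    """Advance (x, y, vx, vy) with the velocity Verlet scheme.

    The scheme is symmetric in time: a step of -dt undoes a step of +dt,
    up to floating-point rounding.
    """
    x, y, vx, vy = state
    ax, ay = acceleration(x, y)
    for _ in range(steps):
        vx += 0.5 * dt * ax
        vy += 0.5 * dt * ay
        x += dt * vx
        y += dt * vy
        ax, ay = acceleration(x, y)
        vx += 0.5 * dt * ax
        vy += 0.5 * dt * ay
    return x, y, vx, vy


start = (1.0, 0.0, 0.0, 0.9)
forward = integrate(start, 0.001, 2000)        # run the film forward
recovered = integrate(forward, -0.001, 2000)   # replace t by -t and run again
error = max(abs(a - b) for a, b in zip(start, recovered))
```

Running the same law with the sign of the time step reversed brings the planet back to its starting position and velocity, apart from rounding error. No comparably simple computation would restore a thunderhead, because its description is statistical rather than a short list of positions and velocities.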

What is the difference between the astronomical and the
meteorological situation which brings about all these differences, and in
particular the difference between the apparent reversibility of
astronomical time and the apparent irreversibility of meteorological
time? In the first place, the meteorological system is one involving
a vast number of approximately equal particles, some of them very
closely coupled to one another, while the astronomical system of the
solar universe contains only a relatively small number of particles,
greatly diverse in size and coupled with one another in a sufficiently
loose way that the second-order coupling effects do not change the
general aspect of the picture we observe, and the very high order
coupling effects are completely negligible. The planets move under
conditions more favorable to the isolation of a certain limited set of
forces than those of any physical experiment we can set up in the
laboratory. Compared with the distances between them, the planets,
and even the sun, are very nearly points. Compared with the elastic
and plastic deformations they suffer, the planets are either very nearly
rigid bodies, or, where they are not, their internal forces are at any
rate of a relatively slight significance where the relative motion of
their centers is concerned. The space in which they move is almost
perfectly free from impeding matter; and in their mutual attraction,
their masses may be considered to lie very nearly at their centers and
to be constant. The departure of the law of gravity from the
inverse square law is most minute. The positions, velocities, and
masses of the bodies of the solar system are extremely well known at
any time, and the computation of their future and past positions,
while not easy in detail, is easy and precise in principle. On the
other hand, in meteorology, the number of particles concerned is so
enormous that an accurate record of their initial positions and
velocities is utterly impossible; and if this record were actually made
and their future positions and velocities computed, we should have
nothing but an impenetrable mass of figures which would need a
radical reinterpretation before it could be of any service to us. The
terms “cloud,” “temperature,” “turbulence,” etc., are all terms
referring not to one single physical situation but to a distribution of
possible situations of which only one actual case is realized. If all
the readings of all the meteorological stations on earth were
simultaneously taken, they would not give a billionth part of the data
necessary to characterize the actual state of the atmosphere from a
Newtonian point of view. They would only give certain constants
consistent with an infinity of different atmospheres, and at most,
together with certain a priori assumptions, capable of giving, as a
probability distribution, a measure over the set of possible
atmospheres. Using the Newtonian laws, or any other system of causal
laws whatever, all that we can predict at any future time is a 
probability distribution of the constants of the system, and even 
this predictability fades out with the increase of time. 

Now, even in a Newtonian system, in which time is perfectly 
reversible, questions of probability and prediction lead to answers 
asymmetrical as between past and future, because the questions to 
which they are answers are asymmetrical. If I set up a physical 
experiment, I bring the system I am considering from the past into 
the present in such a way that I fix certain quantities and have a 
reasonable right to assume that certain other quantities have known 
statistical distributions. I then observe the statistical distribution of 
results after a given time. This is not a process which I can reverse. 
In order to do so, it would be necessary to pick out a fair distribution 
of systems which, without intervention on our part, would end up 
within certain statistical limits, and find out what the antecedent
conditions were a given time ago. However, for a system starting 
from an unknown position to end up in any tightly defined statistical 
range is so rare an occurrence that we may regard it as a miracle, 
and we cannot base our experimental technique on awaiting and 
counting miracles. In short, we are directed in time, and our 
relation to the future is different from our relation to the past. All 
our questions are conditioned by this asymmetry, and all our answers 
to these questions are equally conditioned by it. 

A very interesting astronomical question concerning the direction
of time comes up in connection with the time of astrophysics,
in which we are observing remote heavenly bodies in a single
observation, and in which there seems to be no unidirectionalness
in the nature of our experiment. Why then does the unidirectional
thermodynamics which is based on experimental terrestrial
observations stand us in such good stead in astrophysics? The answer is
interesting and not too obvious. Our observations of the stars are
through the agency of light, of rays or particles emerging from the
observed object and perceived by us. We can perceive incoming
light, but can not perceive outgoing light, or at least the perception
of outgoing light is not achieved by an experiment as simple and
direct as that of incoming light. In the perception of incoming
light, we end up with the eye or a photographic plate. We condition
these for the reception of images by putting them in a state of
insulation for some time past: we dark-condition the eye to avoid
after-images, and we wrap our plates in black paper to prevent
halation. It is clear that only such an eye and only such plates are
any use to us: if we were given to pre-images, we might as well be
blind; and if we had to put our plates in black paper after we use 
them and develop them before using, photography would be a very 
difficult art indeed. This being the case, we can see those stars 
radiating to us and to the whole world; while if there are any stars 
whose evolution is in the reverse direction, they will attract radiation 
from the whole heavens, and even this attraction from us will not be 
perceptible in any way, in view of the fact that we already know our 
own past but not our future. Thus the part of the universe which we 
see must have its past-future relations, as far as the emission of 
radiation is concerned, concordant with our own. The very fact
that we see a star means that its thermodynamics is like our own. 

Indeed, it is a very interesting intellectual experiment to make the
fantasy of an intelligent being whose time should run the other way
to our own. To such a being, all communication with us would be
impossible. Any signal he might send would reach us with a logical
stream of consequents from his point of view, antecedents from
ours. These antecedents would already be in our experience, and
would have served to us as the natural explanation of his signal,
without presupposing an intelligent being to have sent it. If he
drew us a square, we should see the remains of his figure as its
precursors, and it would seem to be the curious crystallization (always
perfectly explainable) of these remains. Its meaning would seem to
be as fortuitous as the faces we read into mountains and cliffs. The
drawing of the square would appear to us as a catastrophe (sudden
indeed, but explainable by natural laws) by which that square
would cease to exist. Our counterpart would have exactly similar
ideas concerning us. Within any world with which we can
communicate, the direction of time is uniform.

To return to the contrast between Newtonian astronomy and
meteorology: most sciences lie in an intermediate position, but most
are rather nearer to meteorology than to astronomy. Even
astronomy, as we have seen, contains a cosmic meteorology. It contains
as well that extremely interesting field studied by Sir George Darwin,
and known as the theory of tidal evolution. We have said that we
can treat the relative movements of the sun and the planets as the
movements of rigid bodies, but this is not quite the case. The earth,
for example, is nearly surrounded by oceans. The water nearer the
moon than the center of the earth is more strongly attracted to the
moon than the solid part of the earth, and the water on the other side
is less strongly attracted. This relatively slight effect pulls the
water into two hills, one under the moon and one opposite to the
moon. In a perfectly liquid sphere, these hills could follow the moon
around the earth with no great dispersal of energy, and consequently
would remain almost precisely under the moon and opposite to the
moon. They would consequently have a pull on the moon which
would not greatly influence the angular position of the moon in the
heavens. However, the tidal wave they produce on the earth gets
tangled up and delayed on coasts and in shallow seas such as the
Bering Sea and the Irish Sea. It consequently lags behind the
position of the moon, and the forces producing this are largely
turbulent, dissipative forces, of a character much like the forces met
in meteorology, and need a statistical treatment. Indeed,
oceanography may be called the meteorology of the hydrosphere rather
than of the atmosphere.
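
The "relatively slight effect" that raises the two hills of water can be given a rough magnitude. The Python sketch below is an added illustration, not part of the original text: it compares the moon's inverse-square pull at the earth's center with the pull on the near and far sides. The constants are standard rounded values supplied here as assumptions, and the earth's rotation and the moon's orbital motion are ignored.

```python
# Rough illustration with rounded standard constants (assumptions added
# here, not figures from the text).
G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
M_MOON = 7.342e22    # mass of the moon, kg
D = 3.844e8          # mean earth-moon distance, center to center, m
R_EARTH = 6.371e6    # mean radius of the earth, m


def lunar_pull(distance):
    """Acceleration toward the moon at the given distance from its center."""
    return G * M_MOON / distance ** 2


center = lunar_pull(D)
# Pull relative to the solid earth, positive when directed toward the moon.
near_side = lunar_pull(D - R_EARTH) - center
far_side = lunar_pull(D + R_EARTH) - center
```

Under these assumptions the near-side water is drawn toward the moon slightly more than the solid earth, and the far-side water slightly less, by amounts on the order of a millionth of a meter per second squared. The opposite signs of the two differences correspond to the two hills, one under the moon and one opposite to it.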

These frictional forces drag the moon back in its course about the 
earth and accelerate the rotation of the earth forward. They tend 
to bring the lengths of the month and of the day ever closer to one 
another. Indeed, the day of the moon is the month, and the moon 
always presents nearly the same face to the earth. It has been
suggested that this is the result of an ancient tidal evolution, when
the moon contained some liquid or gas or plastic material which
could give under the earth’s attraction, and in so giving could dissipate
large amounts of energy. This phenomenon of tidal evolution is not
confined to the earth and the moon but may be observed to some
degree throughout all gravitating systems. In ages past it has
seriously modified the face of the solar system, though in anything
like historic times this modification is slight compared with the
“rigid-body” motion of the planets of the solar system.

Thus even gravitational astronomy involves frictional processes
that run down. There is not a single science which conforms
precisely to the strict Newtonian pattern. The biological sciences
certainly have their full share of one-way phenomena. Birth is not
the exact reverse of death, nor is anabolism (the building up of
tissues) the exact reverse of catabolism (their breaking down). The
division of cells does not follow a pattern symmetrical in time, nor
does the union of the germ cells to form the fertilized ovum. The
individual is an arrow pointed through time in one way, and the race 
is equally directed from the past into the future. 

The record of paleontology indicates a definite long-time trend, 
interrupted and complicated though it might be, from the simple to 
the complex. By the middle of the last century this trend had
become apparent to all scientists with an honestly open mind, and it
is no accident that the problem of discovering its mechanisms was
carried ahead through the same great step by two men working at
about the same time: Charles Darwin and Alfred Wallace. This step
was the realization that a mere fortuitous variation of the individuals
of a species might be carved into the form of a more or less
one-directional or few-directional progress for each line by the varying
degrees of viability of the several variations, either from the point of
view of the individual or of the race. A mutant dog without legs
will certainly starve, while a long thin lizard that has developed the
mechanism of crawling on its ribs may have a better chance for
survival if it has clean lines and is freed from the impeding
projections of limbs. An aquatic animal, whether fish, lizard, or mammal,
will swim better with a fusiform shape, powerful body muscles, and 
a posterior appendage which will catch the water; and if it is 
dependent for its food on the pursuit of swift prey, its chances of 
survival may depend on its assuming this form. 

Darwinian evolution is thus a mechanism by which a more or less 
fortuitous variability is combined into a rather definite pattern. 
Darwin’s principle still holds today, though we have a much better
knowledge of the mechanism on which it depends. The work of
Mendel has given us a far more precise and discontinuous view of
heredity than that held by Darwin, while the notion of mutation,
from the time of de Vries on, has completely altered our conception
of the statistical basis of mutation. We have studied the fine
anatomy of the chromosome and have localized the gene on it. The
list of modern geneticists is long and distinguished. Several of
these, such as Haldane, have made the statistical study of
Mendelianism an effective tool for the study of evolution.

We have already spoken of the tidal evolution of Sir George
Darwin, Charles Darwin’s son. Neither the connection of the idea of
the son with that of the father nor the choice of the name
“evolution” is fortuitous. In tidal evolution as well as in the origin of
species, we have a mechanism by means of which a fortuitous
variability, that of the random motions of the waves in a tidal sea
and of the molecules of the water, is converted by a dynamical process
into a pattern of development which reads in one direction. The
theory of tidal evolution is quite definitely an astronomical
application of the elder Darwin.
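
The mechanism common to both Darwins, an undirected variability converted into a trend that reads in one direction, can be illustrated by a toy simulation in Python. Everything in it is a hypothetical simplification added for illustration: a single numerical "trait" stands in for fitness, variation is Gaussian noise with no preferred direction, and viability is modeled crudely by letting only the better-adapted half of each generation reproduce.

```python
import random

# Toy model only: the trait, the noise level, and the selection rule are
# illustrative assumptions, not a description of any real population.

def evolve(generations=50, population_size=200, seed=0):
    """Return the mean trait value of the population after each generation."""
    rng = random.Random(seed)
    population = [0.0] * population_size
    means = []
    for _ in range(generations):
        # Fortuitous variation: each offspring differs from its parent by
        # noise that is as likely to be harmful as helpful.
        offspring = [trait + rng.gauss(0.0, 1.0) for trait in population]
        # Differential viability: only the better-adapted half survives.
        offspring.sort(reverse=True)
        survivors = offspring[: population_size // 2]
        # Each survivor leaves two descendants, restoring the population size.
        population = survivors + survivors
        means.append(sum(population) / len(population))
    return means


means = evolve()
```

Although no individual variation is aimed anywhere, the mean trait in this sketch drifts steadily in one direction, because the sieve of viability is applied anew in every generation. Removing the selection step leaves a population whose mean merely wanders.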

The third of the dynasty of Darwins, Sir Charles, is one of the 
authorities on modern quantum mechanics. This fact may be 
fortuitous, but it nevertheless represents an even further invasion of
Newtonian ideas by ideas of statistics. The succession of names
Maxwell-Boltzmann-Gibbs represents a progressive reduction of
thermodynamics to statistical mechanics: that is, a reduction of the
phenomena concerning heat and temperature to phenomena in
which a Newtonian mechanics is applied to a situation in which we
deal not with a single dynamical system but with a statistical
distribution of dynamical systems; and in which our conclusions concern
not all such systems but an overwhelming majority of them. About
the year 1900, it became apparent that there was something seriously
wrong with thermodynamics, particularly where it concerned
radiation. The ether showed much less power to absorb radiations
of high frequency (as shown by the law of Planck) than any existing
mechanization of radiation theory had allowed. Planck gave a
quasi-atomic theory of radiation (the quantum theory) which
accounted satisfactorily enough for these phenomena, but which
was at odds with the whole remainder of physics; and Niels Bohr
followed this up with a similarly ad hoc theory of the atom. Thus
Newton and Planck-Bohr formed, respectively, the thesis and
antithesis of a Hegelian antinomy. The synthesis is the statistical
theory discovered by Heisenberg in 1925, in which the statistical
Newtonian dynamics of Gibbs is replaced by a statistical theory very
similar to that of Newton and Gibbs for large-scale phenomena, but
in which the complete collection of data for the present and the past
is not sufficient to predict the future more than statistically. It is
thus not too much to say that not only the Newtonian astronomy
but even the Newtonian physics has become a picture of the average
results of a statistical situation, and hence an account of an
evolutionary process.

This transition from a Newtonian, reversible time to a Gibbsian,
irreversible time has had its philosophical echoes. Bergson
emphasized the difference between the reversible time of physics, in which
nothing new happens, and the irreversible time of evolution and
biology, in which there is always something new. The realization
that the Newtonian physics was not the proper frame for biology
was perhaps the central point in the old controversy between vitalism
and mechanism; although this was complicated by the desire to
conserve in some form or other at least the shadows of the soul and
of God against the inroads of materialism. In the end, as we have
seen, the vitalist proved too much. Instead of building a wall
between the claims of life and those of physics, the wall has been
erected to surround so wide a compass that both matter and life
find themselves inside it. It is true that the matter of the newer
physics is not the matter of Newton, but it is something quite as
remote from the anthropomorphizing desires of the vitalists. The
chance of the quantum theoretician is not the ethical freedom of the
Augustinian, and Tyche is as relentless a mistress as Ananke.

The thought of every age is reflected in its technique. The civil 
engineers of ancient days were land surveyors, astronomers, and 
navigators; those of the seventeenth and early eighteenth centuries 
were clockmakers and grinders of lenses. As in ancient times, the 
craftsmen made their tools in the image of the heavens. A watch is 
nothing but a pocket orrery, moving by necessity as do the celestial 
spheres; and if friction and the dissipation of energy play a role in 
it, they are effects to be overcome, so that the resulting motion of
the hands may be as periodic and regular as possible. The chief
technical result of this engineering after the model of Huyghens and
Newton was the age of navigation, in which for the first time it was
possible to compute longitudes with a respectable precision, and to
convert the commerce of the great oceans from a thing of chance and 
adventure to a regular understood business. It is the engineering 
of the mercantilists. 

To the merchant succeeded the manufacturer, and to the chronom- 
eter, the steam engine. From the Newcomen engine almost to the 
present time, the central field of engineering has been the study of 
prime movers. Heat has been converted into usable energy of 
rotation and translation, and the physics of Newton has been 
supplemented by that of Rumford, Carnot, and Joule. Thermodynamics
makes its appearance, a science in which time is eminently
irreversible; and although the earlier stages of this science seem to
represent a region of thought almost without contact with the Newtonian
dynamics, the theory of the conservation of energy and the
later statistical explanation of the Carnot principle or second law of
thermodynamics or principle of the degradation of energy—that
principle which makes the maximum efficiency obtainable by a
steam engine depend on the working temperatures of the boiler
and the condenser—all these have fused thermodynamics and the
Newtonian dynamics into the statistical and the non-statistical
aspects of the same science.

If the seventeenth and early eighteenth centuries are the age of
clocks, and the later eighteenth and the nineteenth centuries constitute
the age of steam engines, the present time is the age of communication
and control. There is in electrical engineering a split which is
known in Germany as the split between the technique of strong 
currents and the technique of weak currents, and which we know as 
the distinction between power and communication engineering. It 
is this split which separates the age just past from that in which we 
are now living. Actually, communication engineering can deal 
with currents of any size whatever and with the movement of engines 
powerful enough to swing massive gun turrets; what distinguishes it 
from power engineering is that its main interest is not economy of 
energy but the accurate reproduction of a signal. This signal may 
be the tap of a key, to be reproduced as the tap of a telegraph receiver 
at the other end; or it may be a sound transmitted and received 
through the apparatus of a telephone; or it may be the turn of a 
ship’s wheel, received as the angular position of the rudder. Thus 
communication engineering began with Gauss, Wheatstone, and the 
first telegraphers. It received its first reasonably scientific treat- 
ment at the hands of Lord Kelvin, after the failure of the first 
transatlantic cable in the middle of the last century; and from the 
eighties on, it was perhaps Heaviside who did the most to bring it 
into a modern shape. The discovery of radar and its use in the 
Second World War, together with the exigencies of the control of 
anti-aircraft fire, have brought to the field a large number of well- 
trained mathematicians and physicists. The wonders of the auto- 
matic computing machine belong to the same realm of ideas, which 
was certainly never so actively pursued in the past as it is at the 
present day. 

At every stage of technique since Daedalus or Hero of Alexandria, 
the ability of the artificer to produce a working simulacrum of a 
living organism has always intrigued people. This desire to produce 
and to study automata has always been expressed in terms of the
living technique of the age. In the days of magic, we have the
bizarre and sinister concept of the Golem, that figure of clay into
which the Rabbi of Prague breathed life with the blasphemy of
the Ineffable Name of God. In the time of Newton, the automaton
becomes the clockwork music box, with the little effigies pirouetting
stiffly on top. In the nineteenth century, the automaton is a
glorified heat engine, burning some combustible fuel instead of the
glycogen of the human muscles. Finally, the present automaton
opens doors by means of photocells, or points guns to the place at
which a radar beam picks up an airplane, or computes the solution
of a differential equation.

Neither the Greek nor the magical automaton lies along the main
lines of the direction of development of the modern machine, nor do
they seem to have had much of an influence on serious philosophic
thought. It is far different with the clockwork automaton. This
idea has played a very genuine and important role in the early
history of modern philosophy, although we are rather prone to ignore
it. 

To begin with, Descartes considers the lower animals as automata.
This is done to avoid questioning the orthodox Christian attitude that
animals have no souls to be saved or damned. Just how these living
automata function is something that Descartes, so far as I know,
never discusses. However, the important allied question of the 
mode of coupling of the human soul, both in sensation and in will, 
with its material environment is one which Descartes does discuss, 
although in a very unsatisfactory manner. He places this coupling 
in the one median part of the brain known to him, the pineal gland. 
As to the nature of this coupling—whether or not it represents a
direct action of mind on matter and of matter on mind—he is none 
too clear. He probably does regard it as a direct action in both 
ways, but he attributes the validity of human experience in its action 
on the outside world to the goodness and honesty of God. 

The role attributed to God in this matter is unstable. Either God 
is entirely passive, in which case it is hard to see how Descartes’ 
explanation really explains anything, or He is an active participant, 
in which case it is hard to see how the guarantee given by His 
honesty can be anything but an active participation in the act of 
sensation. Thus the causal chain of material phenomena is paralleled 
by a causal chain starting with the act of God, by which He produces 
in us the experiences corresponding to a given material situation. 
Once this is assumed, it is entirely natural to attribute the corre- 
spondence between our will and the effects it seems to produce in 
the external world to a similar divine intervention. This is the path 
followed by the Occasionalists, Geulincx and Malebranche. In
Spinoza, who is in many ways the continuator of this school, the
doctrine of Occasionalism assumes the more reasonable form of
asserting that the correspondence between mind and matter is that
of two self-contained attributes of God; but Spinoza is not dynamically
minded, and gives little or no attention to the mechanism of
this correspondence. 

This is the situation from which Leibniz starts, but Leibniz is as 
dynamically minded as Spinoza is geometrically minded. First, he
replaces the pair of corresponding elements, mind and matter, by a
continuum of corresponding elements: the monads. While these are
conceived after the pattern of the soul, they include many instances
which do not rise to the degree of self-consciousness of full souls, and
which form part of that world which Descartes would have attributed
to matter. Each of them lives in its own closed universe, with a
perfect causal chain from the creation or from minus infinity in time
to the indefinitely remote future; but closed though they are, they
correspond one to the other through the pre-established harmony of
God. Leibniz compares them to clocks which have so been wound
up as to keep time together from the creation for all eternity.
Unlike humanly made clocks, they do not drift into asynchronism;
but this is due to the miraculously perfect workmanship of the 
Creator. 

Thus Leibniz considers a world of automata, which, as is natural in
a disciple of Huyghens, he constructs after the model of clockwork. 
Though the monads reflect one another, the reflection does not consist 
in a transfer of the causal chain from one to another. They are 
actually as self-contained as, or rather more self-contained than, the 
passively dancing figures on top of a music box. They have no real 
influence on the outside world, nor are they effectively influenced by
it. As he says, they have no windows. The apparent organization
of the world we see is something between a figment and a miracle. 
The monad is a Newtonian solar system writ small. 

In the nineteenth century, the automata which are humanly
constructed and those other natural automata, the animals and
plants of the materialist, are studied from a very different aspect.
The conservation and the degradation of energy are the ruling
principles of the day. The living organism is above all a heat engine,
burning glucose or glycogen or starch, fats, and proteins into
carbon dioxide, water, and urea. It is the metabolic balance which
is the center of attention; and if the low working temperatures of
animal muscle attract attention as opposed to the high working
temperatures of a heat engine of similar efficiency, this fact is pushed
into a corner and glibly explained by a contrast between the chemical
energy of the living organism and the thermal energy of the heat
engine. All the fundamental notions are those associated with
energy, and the chief of these is that of potential. The engineering
of the body is a branch of power engineering. Even today, this is the
predominating point of view of the more classically minded, conservative
physiologists; and the whole trend of thought of such
biophysicists as Rashevsky and his school bears witness to its 
continued potency. 

Today we are coming to realize that the body is very far from a 
conservative system, and that its component parts work in an
environment where the available power is much less limited than
we have taken it to be. The electronic tube has shown us that a
system with an outside source of energy, almost all of which is
wasted, may be a very effective agency for performing desired
operations, especially if it is worked at a low energy level. We are
beginning to see that such important elements as the neurons, the
atoms of the nervous complex of our body, do their work under much
the same conditions as vacuum tubes, with their relatively small
power supplied from outside by the circulation, and that the bookkeeping
which is most essential to describe their function is not one
of energy. In short, the newer study of automata, whether in the
metal or in the flesh, is a branch of communication engineering, and 
its cardinal notions are those of message, amount of disturbance or 
“noise”—a term taken over from the telephone engineer—quantity 
of information, coding technique, and so on. 

In such a theory, we deal with automata effectively coupled to the
external world, not merely by their energy flow, their metabolism,
but also by a flow of impressions, of incoming messages, and of the
actions of outgoing messages. The organs by which impressions are
received are the equivalents of the human and animal sense organs.
They comprise photoelectric cells and other receptors for light;
radar systems, receiving their own short Hertzian waves;
hydrogen-ion-potential recorders, which may be said to taste; thermometers;
pressure gauges of various sorts; microphones; and so on. The
effectors may be electrical motors or solenoids or heating coils or
other instruments of very diverse sorts. Between the receptor or
sense organ and the effector stands an intermediate set of elements,
whose function is to recombine the incoming impressions into such
form as to produce a desired type of response in the effectors. The
information fed into this central control system will very often
contain information concerning the functioning of the effectors
themselves. These correspond among other things to the kinesthetic
organs and other proprioceptors of the human system, for we too
have organs which record the position of a joint or the rate of
contraction of a muscle, etc. Moreover, the information received by
the automaton need not be used at once but may be delayed or
stored so as to become available at some future time. This is the
analogue of memory. Finally, as long as the automaton is running,
its very rules of operation are susceptible to some change on the basis
of the data which have passed through its receptors in the past, and
this is not unlike the process of learning.
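
The coupling of receptor, central element, and effector can be
sketched in a few lines of Python. The sketch is illustrative only: the
thermostat, its setpoint, the half-degree dead band, the outside
temperature, and the heating and cooling constants are all assumptions
chosen for the example, not figures from the text. The thermometer
reading plays the part of the receptor, the heater that of the effector,
and the switching rule, which also retains the present state of the
heater, stands for the intermediate set of elements.

```python
def run_thermostat(setpoint=20.0, start=15.0, steps=200):
    """Simulate a toy thermostat; every constant here is an assumed value."""
    temp = start
    heater_on = False          # state of the effector, kept by the controller
    history = []
    for _ in range(steps):
        reading = temp         # receptor: the thermometer reports the room
        # Central element: a switching rule with a dead band, so the decision
        # depends on the reading and on what the heater is already doing.
        if reading < setpoint - 0.5:
            heater_on = True
        elif reading > setpoint + 0.5:
            heater_on = False
        # Effector and environment: the heater adds heat, and the room
        # leaks heat toward an assumed outside temperature of 10 degrees.
        temp += (1.0 if heater_on else 0.0) - 0.05 * (temp - 10.0)
        history.append(temp)
    return history

history = run_thermostat()
```

After the first few steps the simulated temperature stays within about
one degree of the setpoint, although no single input was planned for in
advance.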

The machines of which we are now speaking are not the dream of 
the sensationalist nor the hope of some future time. They already
exist as thermostats, automatic gyrocompass ship-steering systems,
self-propelled missiles—especially such as seek their target—anti-aircraft
fire-control systems, automatically controlled oil-cracking
stills, ultra-rapid computing machines, and the like. They had
begun to be used long before the war—indeed, the very old steam-engine
governor belongs among them—but the great mechanization
of the Second World War brought them into their own, and the need 
of handling the extremely dangerous energy of the atom will probably 
bring them to a still higher point of development. Scarcely a month 
passes but a new book appears on these so-called control mechanisms,
or servomechanisms, and the present age is as truly the age of
servomechanisms as the nineteenth century was the age of the steam
engine or the eighteenth century the age of the clock.

To sum up: the many automata of the present age are coupled to
the outside world both for the reception of impressions and for the
performance of actions. They contain sense organs, effectors, and
the equivalent of a nervous system to integrate the transfer of
information from the one to the other. They lend themselves very 
well to description in physiological terms. It is scarcely a miracle 
that they can be subsumed under one theory with the mechanisms of 
physiology. 

The relation of these mechanisms to time demands careful study. 
It is clear, of course, that the relation input-output is a consecutive
one in time and involves a definite past-future order. What is
perhaps not so clear is that the theory of the sensitive automata is a
statistical one. We are scarcely ever interested in the performance
of a communication-engineering machine for a single input. To
function adequately, it must give a satisfactory performance for a
whole class of inputs, and this means a statistically satisfactory
performance for the class of input which it is statistically expected to
receive. Thus its theory belongs to the Gibbsian statistical mechanics
rather than to the classical Newtonian mechanics. We shall study 
this in much more detail in the chapter devoted to the theory of 
communication. 

Thus the modern automaton exists in the same sort of Bergsonian 
time as the living organism; and hence there is no reason in Bergson’s
considerations why the essential mode of functioning of the living
organism should not be the same as that of the automaton of this
type. Vitalism has won to the extent that even mechanisms
correspond to the time-structure of vitalism; but as we have said,
this victory is a complete defeat, for from every point of view which
has the slightest relation to morality or religion, the new mechanics
is fully as mechanistic as the old. Whether we should call the new
point of view materialistic is largely a question of words: the
ascendancy of matter characterizes a phase of nineteenth-century
physics far more than the present age, and “materialism” has come
to be but little more than a loose synonym for “mechanism.” In
fact, the whole mechanist-vitalist controversy has been relegated to 
the limbo of badly posed questions. 



II 


Groups and Statistical Mechanics 


At about the beginning of the present century, two scientists, one
in the United States and one in France, were working along lines
which would have seemed to each of them entirely unrelated, if
either had had the remotest idea of the existence of the other. In
New Haven, Willard Gibbs was developing his new point of view in
statistical mechanics. In Paris, Henri Lebesgue was rivalling the
fame of his master Emile Borel by the discovery of a revised and 
more powerful theory of integration for use in the study of trigono- 
metric series. The two discoverers were alike in this, that each was 
a man of the study rather than of the laboratory, but from this point 
on, their whole attitudes to science were diametrically opposite. 

Gibbs, mathematician though he was, always regarded mathe- 
matics as ancillary to physics. Lebesgue was an analyst of the purest 
type, an able exponent of the extremely exacting modern standards
of mathematical rigor, and a writer whose works, as far as I know, do
not contain one single example of a problem or a method originating
directly from physics. Nevertheless, the work of these two men
forms a single whole in which the questions asked by Gibbs find their
answers, not in his own work but in the work of Lebesgue. 

The key idea of Gibbs is this: in Newton’s dynamics, in its original 
form, we are concerned with an individual system, with given initial
velocities and momenta, undergoing changes according to a certain
system of forces under the Newtonian laws which link force and
acceleration. In the vast majority of practical cases, however, we
are far from knowing all the initial velocities and momenta. If we
assume a certain initial distribution of the incompletely known
positions and momenta of the system, this will determine in a completely
Newtonian way the distribution of the momenta and positions
for any future time. It will then be possible to make statements
about these distributions, and some of these will have the character
of assertions that the future system will have certain characteristics
with probability one, or certain other characteristics with probability 
zero. 

Probabilities one and zero are notions which include complete
certainty and complete impossibility but include much more as well.
If I shoot at a target with a bullet of the dimensions of a point, the
chance that I hit any specific point on the target will generally be
zero, although it is not impossible that I hit it; and indeed, in each 
specific case I must actually hit some specific point, which is an 
event of probability zero. Thus an event of probability one, that of 
my hitting some point, may be made up of an assemblage of instances 
of probability zero. 

Nevertheless, one of the processes which is used in the technique 
of the Gibbsian statistical mechanics, although it is used implicitly, 
and Gibbs is nowhere clearly aware of it, is the resolution of a 
complex contingency into an infinite sequence of more special 
contingencies—a first, a second, a third, and so on—each of which 
has a known probability; and the expression of the probability of 
this larger contingency as the sum of the probabilities of the more 
special contingencies, which form an infinite sequence. Thus we 
cannot sum probabilities in all conceivable cases, to get a probability 
of the total event—for the sum of any number of zeros is zero— 
while we can sum them if there is a first, a second, a third member, 
and so on, forming a sequence of contingencies in which every term 
has a definite position given by a positive integer.
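
In modern notation the rule described here may be written as follows.
This is a restatement under assumed symbols, none of which appear in the
text: P denotes probability, E_1, E_2, ... a sequence of mutually
exclusive contingencies, and S the target of the preceding paragraph.

```latex
% Probabilities add over a sequence of mutually exclusive contingencies:
P\Bigl(\bigcup_{n=1}^{\infty} E_n\Bigr) \;=\; \sum_{n=1}^{\infty} P(E_n).

% The target S is the union of its points, each of probability zero, but the
% points cannot be arranged in a sequence, so the rule is not available and
% no contradiction arises:
P(S) = 1, \qquad P(\{x\}) = 0 \quad \text{for every point } x \in S .
```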

The distinction between these two cases involves rather subtle 
considerations concerning the nature of sets of instances, and Gibbs,
although a very powerful mathematician, was never a very subtle
one. Is it possible for a class to be infinite and yet essentially
different in multiplicity from another infinite class, such as that of
the positive integers? This problem was solved toward the end
of the last century by Georg Cantor, and the answer is “Yes.” If
we consider all the distinct decimal fractions, terminating or
non-terminating, lying between 0 and 1, it is known that they cannot be
arranged in 1, 2, 3 order—although, strangely enough, all the
terminating decimal fractions can be so arranged. Thus the distinction
demanded by the Gibbs statistical mechanics is not on the face
of it an impossible one. The service of Lebesgue to the Gibbs
theory is to show that the implicit requirements of statistical
mechanics concerning contingencies of probability zero and the
addition of the probabilities of contingencies can actually be met,
and that the Gibbsian theory does not involve contradictions.
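
The remark that the terminating decimal fractions can be arranged in
1, 2, 3 order can be made concrete. The following Python sketch shows
one possible arrangement, offered as an illustration rather than as
Cantor's own construction: the fractions are listed by the number of
digits they need, and the function name is an assumption of the example.

```python
from fractions import Fraction
from itertools import islice

def terminating_decimals():
    """Yield every terminating decimal in [0, 1) exactly once,
    those with the shortest expansions first."""
    digits = 1
    while True:
        for k in range(10 ** digits):
            # A value whose last digit is 0 has a shorter expansion and
            # was already listed, so it is skipped (0 itself is kept once).
            if digits == 1 or k % 10 != 0:
                yield Fraction(k, 10 ** digits)
        digits += 1

# The first hundred members: all fractions with at most two digits.
first_hundred = list(islice(terminating_decimals(), 100))
```

Every terminating decimal thus receives a definite place in the
sequence; for instance, 0.25 appears among the first hundred members. No
such listing is possible once the non-terminating decimals are admitted.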

Lebesgue’s work, however, was not directly based on the needs 
of statistical mechanics but on what looks like a very different 
theory, the theory of trigonometric series. This goes back to the 
eighteenth-century physics of waves and vibrations, and to the then 
moot question of the generality of the sets of motions of a linear
system which can be synthesized out of the simple vibrations of the
system—out of those vibrations, in other words, for which the passing
of time simply multiplies the deviations of the system from equilibrium
by a quantity, positive or negative, dependent on the time
alone and not on position. Thus a single function is expressed as 
the sum of a series. In these series, the coefficients are expressed as 
averages of the product of the function to be represented, multiplied 
by a given weighting function. The whole theory depends on the 
properties of the average of a series, in terms of the average of an 
individual term. Notice that the average of a quantity which is 1 
over an interval from 0 to A, and 0 from A to 1, is A, and may be 
regarded as the probability that the random point should lie in the 
interval from 0 to A if it is known to lie between 0 and 1. In other 
words, the theory needed for the average of a series is very close to
the theory needed for an adequate discussion of probabilities com- 
pounded from an infinite sequence of cases. This is the reason why 
Lebesgue, in solving his own problem, had also solved that of Gibbs. 
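
The identification of an average with a probability can be checked
numerically. The sketch below is a crude illustration on an evenly
spaced grid, not Lebesgue's construction; the function name, the grid
size, and the choice A = 0.25 are assumptions made for the example.

```python
def indicator_average(a, n=1000):
    """Average over [0, 1] of the function that is 1 from 0 to a and
    0 from a to 1, sampled at the n midpoints of an even grid."""
    hits = 0
    for i in range(n):
        x = (i + 0.5) / n      # midpoint of the i-th subinterval
        if x < a:
            hits += 1
    return hits / n

average = indicator_average(0.25)
```

The value obtained is 0.25, which is at once the average of the
function and the probability that a point taken at random between 0 and
1 falls between 0 and A.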

The particular distributions discussed by Gibbs have themselves 
a dynamical interpretation. If we consider a certain very general 
sort of conservative dynamical system, with N degrees of freedom, 
we find that its position and velocity coordinates may be reduced to 
a special set of 2N coordinates, N of which are called the generalized
position coordinates and N the generalized momenta. These
determine a 2N-dimensional space defining a 2N-dimensional volume;
and if we take any region of this space and let the points flow with
the course of time, which changes every set of 2N coordinates into a
new set depending on the elapsed time, the continual change of the
boundary of the region does not change its 2N-dimensional volume.
In general, for sets not so simply defined as these regions, the notion
of volume generates a system of measure of the type of Lebesgue.
In this system of measure, and in the conservative dynamical systems
which are transformed in such a way as to keep this measure constant,
there is one other numerically valued entity which also remains
constant: the energy. If all the bodies in the system act only on
one another and there are no forces attached to fixed positions and
fixed orientations in space, there are two other expressions which also
remain constant. Both of these are vectors: the momentum, and
the moment of momentum of the system as a whole. They are not
difficult to eliminate, so that the system is replaced by a system with
fewer degrees of freedom. 
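
The constancy of volume under the flow can be seen in the simplest
case. The sketch below assumes a harmonic oscillator of unit mass and
unit frequency, so that N = 1 and the 2N-dimensional volume is an area
in the plane of position q and momentum p; the oscillator, the square
region, and the elapsed time are all choices made for illustration.

```python
import math

def flow(point, t):
    """Exact motion of the assumed oscillator, H = (q*q + p*p) / 2,
    carrying the phase point (q, p) forward through time t."""
    q, p = point
    return (q * math.cos(t) + p * math.sin(t),
            -q * math.sin(t) + p * math.cos(t))

def polygon_area(vertices):
    """Area of a polygon from its vertices (shoelace formula)."""
    total = 0.0
    for i in range(len(vertices)):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % len(vertices)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

# A square region of phase space and its image after time 0.7.
# The flow is linear, so the image is the polygon of the moved corners.
region = [(1.0, 0.0), (2.0, 0.0), (2.0, 1.0), (1.0, 1.0)]
moved = [flow(corner, 0.7) for corner in region]
area_before = polygon_area(region)
area_after = polygon_area(moved)
```

The boundary of the region has changed, but the two areas agree to
rounding error, and the energy of each corner point is likewise
unchanged.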

In highly specialized systems, there may be other quantities not 
determined by the energy, the momentum, and the moment of 
momentum, which are unchanged as the system develops. However,
it is known that systems in which another invariant quantity
exists, dependent on the initial coordinates and momenta of a
dynamical system, and regular enough to be subject to the system of
integration based on Lebesgue measure, are very rare indeed in a
quite precise sense.¹ In systems without other invariant quantities,
we can fix the coordinates corresponding to energy, momentum, and
total moment of momentum, and in the space of the remaining 
coordinates, the measure determined by the position and momentum 
coordinates will itself determine a sort of sub-measure, just as measure 
in space will determine area on a two-dimensional surface out of a 
family of two-dimensional surfaces. For example, if our family is 
that of concentric spheres, then the volume between two concentric 
spheres close together, when normalized by taking as one the total
volume of the region between the two spheres, will give in the limit
a measure of area on the surface of a sphere. 

Let us then take this new measure on a region in phase space for
which energy, total momentum, and total moment of momentum are
determined, and let us suppose that there are no other measurable
invariant quantities in the system. Let the total measure of this
restricted region be constant, or as we can make it by a change in
scale, 1. As our measure has been obtained from a measure invariant
in time, in a way invariant in time, it will itself be invariant. We
shall call this measure phase measure, and averages taken with
respect to it phase averages. 

However, any quantity varying in time may also have a time
average. If, for example, f(t) depends on t, its time average for the
past will be

\lim_{T \to \infty} \frac{1}{T} \int_{-T}^{0} f(t)\,dt (2.01)

and its time average for the future

\lim_{T \to \infty} \frac{1}{T} \int_{0}^{T} f(t)\,dt (2.02)

¹ Oxtoby, J. C., and S. M. Ulam, “Measure-Preserving Homeomorphisms and
Metrical Transitivity,” Ann. of Math., Ser. 2, 42, 874-920 (1941).

In Gibbs’ statistical mechanics, both time averages and space 
averages occur. It was a brilliant idea of Gibbs to try to show that 
these two types of average were, in some sense, the same. In the 
notion that these two types of average were related, Gibbs was 
perfectly right; and in the method by which he tried to show this 
relation, he was utterly and hopelessly wrong. For this he was
scarcely to blame. Even at the time of his death, the fame of the
Lebesgue integral had just begun to penetrate to America. For
another fifteen years, it was a museum curiosity, only useful to show
to young mathematicians the needs and possibilities of rigor. A
mathematician as distinguished as W. F. Osgood¹ would have
nothing to do with it till his dying day. It was not until about 1930
that a group of mathematicians—Koopman, von Neumann, Birkhoff²
—finally established the proper foundations of the Gibbs statistical 
mechanics. Later, in the study of ergodic theory, we shall see what 
these foundations were. 
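
A simple case in which the two kinds of average do agree may help to
fix ideas. The sketch below assumes a harmonic oscillator of unit mass
and unit frequency with energy one half, so that the region of constant
energy is the circle q*q + p*p = 1 and the phase measure is uniform in
angle. The quantity averaged, q*q, and the finite horizon T = 1000 are
likewise choices made for the example; the limit in the time average is
only approximated.

```python
import math

def time_average(f, T, steps=200_000):
    """Approximate (1/T) times the integral of f(t) from 0 to T,
    using the midpoint rule with the given number of steps."""
    dt = T / steps
    return sum(f((i + 0.5) * dt) for i in range(steps)) * dt / T

def phase_average(g, points=3600):
    """Average g(q, p) over the circle q*q + p*p = 1, taking evenly
    spaced angles as a stand-in for the uniform phase measure."""
    total = 0.0
    for k in range(points):
        theta = 2.0 * math.pi * k / points
        total += g(math.cos(theta), math.sin(theta))
    return total / points

# Along the path starting at (q, p) = (1, 0), q(t) = cos t.
time_avg = time_average(lambda t: math.cos(t) ** 2, T=1000.0)
phase_avg = phase_average(lambda q, p: q ** 2)
```

Both numbers come out close to one half. The agreement here is easy
because the path covers the whole circle; the difficulty of the general
theory lies in systems of more degrees of freedom, where no such simple
picture is available.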

Gibbs himself thought that in a system from which all the invariants
had been removed as extra coordinates almost all paths of
points in phase space passed through all coordinates in such a space.
This hypothesis he called the ergodic hypothesis, from the Greek
words ἔργον, “work,” and ὁδός, “path.” Now, in the first place, as
Plancherel and others have shown, there is no significant case where
that hypothesis is true. No differentiable path can cover an area in
the plane, even if it is of infinite length. The followers of Gibbs,
including at the end perhaps Gibbs himself, saw this in a vague way,
and replaced this hypothesis by the quasi-ergodic hypothesis, which
merely asserts that in the course of time a system generally passes
indefinitely near to every point in the region of phase space determined
by the known invariants. There is no logical difficulty as to
the truth of this: it is merely quite inadequate for the conclusions
which Gibbs bases on it. It says nothing about the relative time 
which the system spends in the neighborhood of each point. 

Beside the notions of average and of measure—the average over a
universe of a function 1 over a set to be measured and 0 elsewhere—
which were most urgently needed to make sense out of Gibbs’ theory,
in order to appreciate the real significance of ergodic theory we need
a more precise analysis of the notion of invariant, as well as the notion
of transformation group. These notions were certainly familiar to
Gibbs, as his study of vector analysis shows. Nevertheless, it is
possible to maintain that he did not assess them at their full
philosophical value. Like his contemporary Heaviside, Gibbs is one of
the scientists whose physico-mathematical acumen often outstrips
their logic and who are generally right, while they are often unable to
explain why and how they are right.

¹ Nevertheless some of Osgood’s early work represented an important step in the
direction of the Lebesgue integral.

² Hopf, E., “Ergodentheorie,” Ergeb. Math., 5, No. 2, Springer, Berlin (1937).

For the existence of any science, it is necessary that there exist
phenomena which do not stand isolated. In a world ruled by a 
succession of miracles performed by an irrational God subject to 
sudden whims, we should be forced to await each new catastrophe in 
a state of perplexed passiveness. We have a picture of such a world 
in the croquet game in Alice in Wonderland, where the mallets are
flamingos; the balls, hedgehogs, which quietly unroll and go about
their own business; the hoops, playing-card soldiers, likewise subject 
to locomotor initiative of their own; and the rules are the decrees of 
the testy, unpredictable Queen of Hearts. 

The essence of an effective rule for a game or a useful law of
physics is that it be statable in advance, and that it apply to more 
than one case. Ideally, it should represent a property of the system 
discussed which remains the same under the flux of particular 
circumstances. In the simplest case, it is a property which is 
invariant to a set of transformations to which the system is subject. 
We are thus led to the notions of transformation, transformation 
group, and invariant.

A transformation of a system is some alteration in which each 
element goes into another. The modification of the solar system 
which occurs in the transition between time t_1 and time t_2 is a
transformation of the sets of coordinates of the planets. The
similar change in their coordinates when we move their origin, or
subject our geometric axes to a rotation, is a transformation. The 
change in scale which occurs when we examine a preparation under 
the magnifying action of a microscope is likewise a transformation. 

The result of following a transformation A by a transformation B
is another transformation, known as the product or resultant BA.
Note that in general it depends on the order of A and B. Thus if A
is the transformation which takes the coordinate x into the coordinate
y, and y into -x, while z is unchanged; while B takes x into z, z into
-x, and y is unchanged; then BA will take x into y, y into -z, and
z into -x; while AB will take x into z, y into -x, and z into -y. If
AB and BA are the same, we shall say that A and B are permutable.

Sometimes, but not always, the transformation A will not only
carry every element of the system into an element but will have
the property that every element is the result of transforming an
element. In this case, there is a unique transformation A^{-1}, such
that both AA^{-1} and A^{-1}A are that very special transformation
which we call I, the identity transformation, which transforms every
element into itself. In this case we call A^{-1} the inverse of A. It is
clear that A is the inverse of A^{-1}, that I is its own inverse, and that
the inverse of AB is B^{-1}A^{-1}.
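
These rules can be verified mechanically for the two transformations A
and B used as an example above. The Python sketch below adopts one
possible representation, assumed for the purpose: each transformation
is a table sending a coordinate name to a sign and a coordinate name,
and the function names are likewise inventions of the example.

```python
# A takes x into y, y into -x, and leaves z; B takes x into z, z into -x,
# and leaves y. Each entry reads: name -> (sign, image name).
A = {"x": (1, "y"), "y": (-1, "x"), "z": (1, "z")}
B = {"x": (1, "z"), "y": (1, "y"), "z": (-1, "x")}
I = {"x": (1, "x"), "y": (1, "y"), "z": (1, "z")}

def product(second, first):
    """Resultant written `second first`: apply `first`, then `second`."""
    result = {}
    for name, (sign, image) in first.items():
        sign2, image2 = second[image]
        result[name] = (sign * sign2, image2)
    return result

def inverse(t):
    """If name goes into sign*image, then image goes back into sign*name."""
    return {image: (sign, name) for name, (sign, image) in t.items()}

BA = product(B, A)   # x into y, y into -z, z into -x
AB = product(A, B)   # x into z, y into -x, z into -y
```

The two products differ, so A and B are not permutable; and
`inverse(AB)` agrees with `product(inverse(B), inverse(A))`, the
factors being inverted in the reverse order.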

There exist certain sets of transformations where every transformation
belonging to the set has an inverse, likewise belonging to
the set; and where the resultant of any two transformations belonging
to the set itself belongs to the set. These sets are known as
transformation groups. The set of all translations along a line, or in a
plane, or in a three-dimensional space, is a transformation group;
and even more, it is a transformation group of the special sort 
known as Abelian, where any two transformations of the group are 
permutable. The set of rotations about a point, and of all motions 
of a rigid body in space, are non-Abelian groups. 

Let us suppose that we have some quantity attached to all the 
elements transformed by a transformation group. If this quantity 
is unchanged when each element is changed by the same transformation
of the group, whatever that transformation may be, it is
called an invariant of the group. There are many sorts of such group 
invariants, of which two are especially important for our purposes. 

The first are the so-called linear invariants. Let the elements
transformed by an Abelian group be the terms which we represent by
x, and let f(x) be a complex-valued function of these elements, with
certain appropriate properties of continuity or integrability. Then
if Tx stands for the element resulting from x under the transformation
T, and if f(x) is a function of absolute value 1, such that

    $$f(Tx) = a(T) f(x) \tag{2.03}$$

where a(T) is a number of absolute value 1 depending only on T, we
shall say that f(x) is a character of the group. It is an invariant of
the group in a slightly generalized sense. If f(x) and g(x) are group
characters, clearly f(x)g(x) is one also, as is $[f(x)]^{-1}$. If we can
represent any function h(x) defined over the group as a linear
combination of the characters of the group, in some such form as

    $$h(x) = \sum_k A_k f_k(x) \tag{2.04}$$

where $f_k(x)$ is a character of the group, and $a_k(T)$ bears the same
relation to $f_k(x)$ that a(T) does to f(x) in Eq. 2.03, then

    $$h(Tx) = \sum_k A_k a_k(T) f_k(x) \tag{2.05}$$

Thus if we can develop h(x) in terms of a set of group characters,
we can develop h(Tx) for all T in terms of the characters.

We have seen that the characters of a group generate other
characters under multiplication and inversion, and it may similarly
be seen that the constant 1 is a character. Multiplication by a
group character thus generates a transformation group of the group
characters themselves, which is known as the character group of the
original group.

If the original group is the translation group on the infinite line,
so that the operator T changes x into x + T, Eq. 2.03 becomes

    $$f(x + T) = a(T) f(x) \tag{2.06}$$

which is satisfied if $f(x) = e^{i\lambda x}$, $a(T) = e^{i\lambda T}$. The characters will be
the functions $e^{i\lambda x}$, and the character group will be the group of
translations changing $\lambda$ into $\lambda + \tau$, thus having the same structure
as the original group. This will not be the case when the original
group consists of the rotations about a circle. In this case, the operator
T changes x into a number between 0 and $2\pi$, differing from
x + T by an integral multiple of $2\pi$, and, while Eq. 2.06 will still
hold, we have the extra condition that

    $$a(T + 2\pi) = a(T) \tag{2.07}$$

If now we put $f(x) = e^{i\lambda x}$ as before, we shall obtain

    $$e^{2\pi i\lambda} = 1 \tag{2.08}$$

which means that $\lambda$ must be a real integer, positive, negative, or
zero. The character group thus corresponds to the translations of
the real integers. If, on the other hand, the original group is that
of the translations of the integers, x and T in Eq. 2.05 are confined
to the integer values, and $e^{i\lambda x}$ involves only the number between 0
and $2\pi$ which differs from $\lambda$ by an integral multiple of $2\pi$. Thus the
character group is essentially the group of rotations about a circle.
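The character law for translations, and the extra single-valuedness condition on the circle, can be tried numerically. This is only a sketch under stated assumptions: the function names and the sample values of the exponent, x, and T are hypothetical, and the tolerances are arbitrary.

```python
import cmath
import math


def character(lam):
    """Candidate character f(x) = exp(i * lam * x)."""
    return lambda x: cmath.exp(1j * lam * x)


def multiplier(lam):
    """Factor a(T) = exp(i * lam * T) picked up when x goes to x + T."""
    return lambda T: cmath.exp(1j * lam * T)


def satisfies_character_law(lam, x, T):
    """Check f(x + T) = a(T) f(x) up to rounding error."""
    f, a = character(lam), multiplier(lam)
    return abs(f(x + T) - a(T) * f(x)) < 1e-12


def single_valued_on_circle(lam, x=0.4):
    """On the circle, x and x + 2*pi are the same point, so f must agree there."""
    f = character(lam)
    return abs(f(x + 2 * math.pi) - f(x)) < 1e-9


print(satisfies_character_law(0.7, 1.3, 2.1))
print([lam for lam in (-2, 0.5, 1, 1.5, 3) if single_valued_on_circle(lam)])
```

On the infinite line any real exponent passes the first check; on the circle only the integer exponents survive the second, which is the content of the condition on the exponent stated above.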

In any character group, for a given character f, the values of a(T)
are distributed in such a way that the distribution is not altered
when they are all multiplied by a(S), for any element S in the group.
That is, if there is any reasonable basis of taking an average of these
values which is not affected by the transformation of the group by
the multiplication of each transformation by a fixed one of its
transformations, either a(T) is always 1, or this average is invariant when
multiplied by some number not 1, and must be 0. From this it may
be concluded that the average of the product of any character by its
conjugate (which will also be a character) will have the value 1, and
that the average of the product of any character by the conjugate of
another character will have the value 0. In other words, if we can
express h(x) as in Eq. 2.04, we shall have

    $$A_k = \text{average}\,[h(x)\overline{f_k(x)}] \tag{2.09}$$

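In the case of the group of rotations on a circle, this gives us
directly that if

    $$f(x) = \sum_n a_n e^{inx} \tag{2.10}$$

then

    $$a_n = \frac{1}{2\pi} \int_0^{2\pi} f(x) e^{-inx}\,dx \tag{2.11}$$

and the result for translations along the infinite line is closely related
to the fact that if in an appropriate sense

    $$f(x) = \int_{-\infty}^{\infty} a(\lambda) e^{i\lambda x}\,d\lambda \tag{2.12}$$

then in a certain sense

    $$a(\lambda) = \frac{1}{2\pi} \int_{-\infty}^{\infty} f(x) e^{-i\lambda x}\,dx \tag{2.13}$$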


These results have been stated here very roughly and without a 
clear statement of their conditions of validity. For more precise 
statements of the theory, the reader should consult the following 
reference. 1 
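For the circle, the rule that a coefficient is the average of the function times the conjugate of the corresponding character (Eqs. 2.09 and 2.11) can be illustrated with a short computation. The sketch below assumes the average is taken over equally spaced sample points, which is a numerical convenience and not part of the text; the test function `h` and the helper names are made up for the illustration.

```python
import cmath
import math


def average_on_circle(g, samples=4096):
    """Average of g over [0, 2*pi), using equally spaced sample points."""
    step = 2 * math.pi / samples
    return sum(g(k * step) for k in range(samples)) / samples


def coefficient(h, n):
    """Average of h(x) times the conjugate of the character exp(i*n*x)."""
    return average_on_circle(lambda x: h(x) * cmath.exp(-1j * n * x))


def h(x):
    """A hypothetical test function built from two characters."""
    return 2.0 * cmath.exp(3j * x) - 0.5 * cmath.exp(-1j * x)


for n in (-1, 0, 3):
    print(n, coefficient(h, n))
```

The averages recover the two coefficients that were put in and give zero for a character that is absent, which is the orthogonality property described above.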

Beside the theory of the linear invariants of a group, there is also
the general theory of its metrical invariants. These are the systems
of Lebesgue measure which do not undergo any change when the
objects transformed by the group are permuted by the operators of
the group. In this connection, we should cite the interesting theory
of group measure, due to Haar. 2 As we have seen, every group itself
is a collection of objects which are permuted by being multiplied by
the operations of the group itself. As such, it may have an invariant
measure. Haar has proved that a certain rather wide class of
groups does possess a uniquely determined invariant measure,
definable in terms of the structure of the group itself.

1 Wiener, N., The Fourier Integral and Certain of Its Applications, The
University Press, Cambridge, England, 1933; Dover Publications, Inc., N.Y.

2 Haar, H., “Der Massbegriff in der Theorie der kontinuierlichen Gruppen,”
Ann. of Math., Ser. 2, 34, 147-189 (1933).

The most important application of the theory of the metrical
invariants of a group of transformations is to show the justification of
that interchangeability of phase averages and time averages which,
as we have already seen, Gibbs tried in vain to establish. The basis
on which this has been accomplished is known as the ergodic theory.

The ordinary ergodic theorems start with an ensemble E, which
we can take to be of measure 1, transformed into itself by a
measure-preserving transformation T or by a group of measure-preserving
transformations $T^\lambda$, where $-\infty < \lambda < \infty$ and where

    $$T^\lambda T^\mu = T^{\lambda+\mu} \tag{2.14}$$

Ergodic theory concerns itself with complex-valued functions f(x)
of the elements x of E. In all cases, f(x) is taken to be measurable
in x, and if we are concerned with a continuous group of
transformations, $f(T^\lambda x)$ is taken to be measurable in x and $\lambda$ simultaneously.

In the mean ergodic theorem of Koopman and von Neumann,
f(x) is taken to be of class $L^2$; that is,

    $$\int_E |f(x)|^2\,dx < \infty \tag{2.15}$$

The theorem then asserts that

    $$f_N(x) = \frac{1}{N+1} \sum_{n=0}^{N} f(T^n x) \tag{2.16}$$

or

    $$f_A(x) = \frac{1}{A} \int_0^A f(T^\lambda x)\,d\lambda \tag{2.17}$$

as the case may be, converges in the mean to a limit f*(x) as $N \to \infty$
or $A \to \infty$, respectively, in the sense that

    $$\lim_{N\to\infty} \int_E |f^*(x) - f_N(x)|^2\,dx = 0 \tag{2.18}$$

    $$\lim_{A\to\infty} \int_E |f^*(x) - f_A(x)|^2\,dx = 0 \tag{2.19}$$

In the “almost everywhere” ergodic theorem of Birkhoff, f(x) is
taken to be of class L, which means that

    $$\int_E |f(x)|\,dx < \infty \tag{2.20}$$

The functions $f_N(x)$ and $f_A(x)$ are defined as in Eqs. 2.16 and 2.17.
The theorem then states that, except for a set of values of x of
measure 0,

    $$f^*(x) = \lim_{N\to\infty} f_N(x) \tag{2.21}$$

and

    $$f^*(x) = \lim_{A\to\infty} f_A(x) \tag{2.22}$$

exist.




A very interesting case is the so-called ergodic or metrically transitive
one, in which the transformation T or the set of transformations $T^\lambda$
leaves invariant no set of points x which has a measure other than
1 or 0. In such a case, the set of values of x (for either ergodic
theorem) for which f* takes on a certain range of values has measure
either 1 or 0. This is impossible unless f*(x) is almost always
constant. The value which f*(x) then assumes almost always is

    $$\int_E f(x)\,dx \tag{2.23}$$

That is, in the Koopman theorem, we have the limit in the mean

    $$\operatorname*{l.i.m.}_{N\to\infty} \frac{1}{N+1} \sum_{n=0}^{N} f(T^n x) = \int_E f(x)\,dx \tag{2.24}$$

and in the Birkhoff theorem, we have

    $$\lim_{N\to\infty} \frac{1}{N+1} \sum_{n=0}^{N} f(T^n x) = \int_E f(x)\,dx \tag{2.25}$$

except for a set of values of x of zero measure or probability 0.
Similar results hold in the continuous case. This is an adequate
justification for Gibbs’ interchange of phase averages and time
averages.
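The equality of time averages and phase averages in the ergodic case can be watched in a small experiment. The sketch below is an illustration only: it assumes the ensemble is the unit interval with ordinary length as measure, and takes for T the shift by an irrational amount modulo 1, a standard example of an ergodic measure-preserving transformation that is not discussed in the text. The function `f`, the starting point, and the step count are arbitrary choices.

```python
import math


def time_average(f, x0, shift, n_steps):
    """Average of f along the orbit x0, T x0, ..., T^N x0 with T x = x + shift (mod 1)."""
    total = 0.0
    x = x0
    for _ in range(n_steps + 1):
        total += f(x)
        x = (x + shift) % 1.0
    return total / (n_steps + 1)


def phase_average(f, samples=10_000):
    """Midpoint-rule estimate of the integral of f over the unit interval."""
    return sum(f((k + 0.5) / samples) for k in range(samples)) / samples


def f(x):
    """An arbitrary bounded test function; its phase average is 1."""
    return math.cos(2 * math.pi * x) ** 2 + x


shift = math.sqrt(2) - 1  # irrational, so the shift is ergodic
print("time average :", time_average(f, 0.1, shift, 100_000))
print("phase average:", phase_average(f))
```

If the shift were rational, the orbit would visit only finitely many points and the two averages would in general disagree, which corresponds to the non-ergodic case.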

Where the transformation T or the transformation group $T^\lambda$ is
not ergodic, von Neumann has shown under very general conditions
that they can be reduced to ergodic components. That is, except
for a set of values of x of zero measure, E can be separated into a
finite or denumerable set of classes $E_n$ and a continuum of classes
E(y), such that a measure is established on each $E_n$ and E(y), which
is invariant under T or $T^\lambda$. These transformations are all ergodic;
and if S(y) is the intersection of S with E(y) and $S_n$ with $E_n$, then

    $$\operatorname{measure}_E(S) = \int \operatorname{measure}_{E(y)}[S(y)]\,dy + \sum_n \operatorname{measure}_{E_n}(S_n) \tag{2.26}$$

In other words, the whole theory of measure-preserving transformations
can be reduced to the theory of ergodic transformations.

The whole of ergodic theory, let us remark in passing, may be 
applied to groups of transformations more general than those 
isomorphic with the translation group on the line. In particular, it 
may be applied to the translation group in n dimensions. The case 
of three dimensions is physically important. The spatial analogue 
of temporal equilibrium is spatial homogeneity, and such theories as 
that of the homogeneous gas, liquid, or solid depend on the applica- 
tion of three-dimensional ergodic theory. Incidentally, a non- 
ergodic group of translation transformations in three dimensions 
appears as the set of translations of a mixture of distinct states, 
such that one or another exists at a given time, not a mixture 
of both. 

One of the cardinal notions of statistical mechanics, which also 
receives an application in the classical thermodynamics, is that of 
entropy. It is primarily a property of regions in phase space and 
expresses the logarithm of their probability measure. For example, 
let us consider the dynamics of n particles in a bottle, divided into
two parts, A and B. If m particles are in A, and n - m in B, we
have characterized a region in phase space, and it will have a certain
probability measure. The logarithm is the entropy of the distribution:
m particles in A, n - m in B. The system will spend most of
its time in a state near that of greatest entropy, in the sense that for
most of the time, nearly $m_1$ particles will be in A, nearly $n - m_1$ in
B, where the probability of the combination $m_1$ in A, $n - m_1$ in B is
a maximum. For systems with a large number of particles and states
within the limits of practical discrimination, this means that if we
take a state of other than maximum entropy and observe what
happens to it, the entropy almost always increases.
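The bottle example can be made concrete with a small count. The following sketch rests on assumptions that the text does not state: the two parts A and B are taken to be of equal volume and the particles independent, so that the probability of finding m of n particles in A is the binomial coefficient divided by $2^n$. The function name is illustrative.

```python
import math


def log_probability(n, m):
    """log2 of the probability of m particles in A, assuming equal halves
    and independent particles: log2(C(n, m) / 2**n)."""
    return math.log2(math.comb(n, m)) - n


n = 100
entropies = [log_probability(n, m) for m in range(n + 1)]
m1 = max(range(n + 1), key=lambda m: entropies[m])

print("most probable m:", m1)
for m in (10, 30, 50):
    print(m, round(entropies[m], 2))
```

Under these assumptions the greatest entropy falls at the even split, and a lopsided state such as 10 in A has a much smaller logarithm of probability, so that nearly every change from it moves toward the maximum.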

In the ordinary thermodynamic problems of the heat engine, we 
are dealing with conditions in which we have a rough thermal 
equilibrium in large regions like an engine cylinder. The states for 
which we study the entropy are states involving maximum entropy
for a given temperature and volume, for a small number of regions of
the given volumes and at the given temperature assumed. Even
the more refined discussions of thermal engines, particularly of
thermal engines like the turbine, in which a gas is expanding in a 
more complicated manner than in a cylinder, do not change these 
conditions too radically. We may still talk of local temperatures, 
with a very fair approximation, even though no temperature is 
precisely determined except in a state of equilibrium and by methods
involving this equilibrium. However, in living matter, we lose much
of even this rough homogeneity. The structure of a protein tissue as
shown by the electron microscope has an enormous definiteness and
fineness of texture, and its physiology is certainly of a corresponding
fineness of texture. This fineness is far greater than that of the
space-and-time scale of an ordinary thermometer, and so the temperatures
read by ordinary thermometers in living tissues are gross averages
and not the true temperatures of thermodynamics. Gibbsian
statistical mechanics may well be a fairly adequate model of what
happens in the body; the picture suggested by the ordinary heat
engine certainly is not. The thermal efficiency of muscle action
means next to nothing, and certainly does not mean what it appears 
to mean. 

A very important idea in statistical mechanics is that of the 
Maxwell demon. Let us suppose a gas in which the particles are
moving around with the distribution of velocities in statistical
equilibrium for a given temperature. For a perfect gas, this is the
Maxwell distribution. Let this gas be contained in a rigid container
with a wall across it, containing an opening spanned by a small
gate, operated by a gatekeeper, either an anthropomorphic demon
or a minute mechanism. When a particle of more than average
velocity approaches the gate from compartment A or a particle of
less than average velocity approaches the gate from compartment B,
the gatekeeper opens the gate, and the particle passes through; but
when a particle of less than average velocity approaches from
compartment A or a particle of greater than average velocity
approaches from compartment B, the gate is closed. In this way,
the concentration of particles of high velocity is increased in
compartment B and is decreased in compartment A. This produces an
apparent decrease in entropy; so that if the two compartments are 
now connected by a heat engine, we seem to obtain a perpetual- 
motion machine of the second kind. 
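The gatekeeper's rule can be written out as a toy procedure. The sketch below is deliberately crude and is not a physical simulation: it ignores positions, collisions, and the demon's own physics, draws speeds from an arbitrary distribution, and simply lets a randomly chosen particle "approach the gate" at each step. All names and parameters are hypothetical.

```python
import random


def run_demon(n_particles=2000, n_events=20_000, seed=0):
    """Apply the gate rule repeatedly; return mean squared speed in A and in B."""
    rng = random.Random(seed)
    speeds = [abs(rng.gauss(0.0, 1.0)) for _ in range(n_particles)]
    side = [rng.choice("AB") for _ in range(n_particles)]
    average_speed = sum(speeds) / n_particles

    for _ in range(n_events):
        i = rng.randrange(n_particles)  # particle i approaches the gate
        fast = speeds[i] > average_speed
        if side[i] == "A" and fast:
            side[i] = "B"  # gate opens: fast particle passes from A to B
        elif side[i] == "B" and not fast:
            side[i] = "A"  # gate opens: slow particle passes from B to A
        # in every other case the gate stays closed

    def mean_square(label):
        values = [s * s for s, c in zip(speeds, side) if c == label]
        return sum(values) / len(values)

    return mean_square("A"), mean_square("B")


cold, hot = run_demon()
print("mean squared speed in A:", round(cold, 3))
print("mean squared speed in B:", round(hot, 3))
```

The sorting leaves compartment B with the larger mean squared speed. The sketch charges nothing for the information the gatekeeper uses, and that omitted cost is precisely what the discussion of the demon turns on.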

It is simpler to repel the question posed by the Maxwell demon 
than to answer it. Nothing is easier than to deny the possibility of 
such beings or structures. We shall actually find that Maxwell 
demons in the strictest sense cannot exist in a system in equilibrium, 
but if we accept this from the beginning, and so not try to demonstrate
it, we shall miss an admirable opportunity to learn something
about entropy and about possible physical, chemical, and biological
systems. 

For a Maxwell demon to act, it must receive information from 
approaching particles concerning their velocity and point of impact
on the wall. Whether these impulses involve a transfer of energy
or not, they must involve a coupling of the demon and the gas.
Now, the law of the increase of entropy applies to a completely
isolated system but does not apply to a non-isolated part of such a
system. Accordingly, the only entropy which concerns us is that
of the system gas-demon, and not that of the gas alone. The gas
entropy is merely one term in the total entropy of the larger system. 
Can we find terms involving the demon as well which contribute to 
this total entropy? 

Most certainly we can. The demon can only act on information 
received, and this information, as we shall see in the next chapter,
represents a negative entropy. The information must be carried by
some physical process, say some form of radiation. It may very
well be that this information is carried at a very low energy level,
and that the transfer of energy between particle and demon is for a
considerable time far less significant than the transfer of information.
However, under the quantum mechanics, it is impossible to obtain
any information giving the position or the momentum of a particle,
much less the two together, without a positive effect on the energy of
the particle examined, exceeding a minimum dependent on the
frequency of the light used for examination. Thus all coupling is
strictly a coupling involving energy, and a system in statistical
equilibrium is in equilibrium both in matters concerning entropy and
those concerning energy. In the long run, the Maxwell demon is
itself subject to a random motion corresponding to the temperature
of its environment, and, as Leibniz says of some of his monads, it
receives a large number of small impressions, until it falls into “a
certain vertigo” and is incapable of clear perceptions. In fact, it
ceases to act as a Maxwell demon. 

Nevertheless, there may be a quite appreciable interval of time 
before the demon is deconditioned, and this time may be so prolonged 
that we may speak of the active phase of the demon as metastable. 
There is no reason to suppose that metastable demons do not in fact 
exist; indeed, it may well be that enzymes are metastable Maxwell 
demons, decreasing entropy, perhaps not by the separation between 
fast and slow particles but by some other equivalent process. We
may well regard living organisms, such as Man himself, in this light.
Certainly the enzyme and the living organism are alike metastable:
the stable state of an enzyme is to be deconditioned, and the stable
state of a living organism is to be dead. All catalysts are ultimately
poisoned: they change rates of reaction but not true equilibrium.
Nevertheless, catalysts and Man alike have sufficiently definite
states of metastability to deserve the recognition of these states as
relatively permanent conditions.

I do not wish to close this chapter without indicating that ergodic
theory is a considerably wider subject than we have indicated above.
There are certain modern developments of ergodic theory in which
the measure to be kept invariant under a set of transformations is
defined directly by the set itself rather than assumed in advance. I
refer especially to the work of Kryloff and Bogoliouboff, and to some
of the work of Hurewicz and the Japanese school.

The next chapter is devoted to the statistical mechanics of time 
series. This is another field in which the conditions are very remote
from those of the statistical mechanics of heat engines and which is 
thus very well suited to serve as a model of what happens in the 
living organism. 



Time Series, Information, 
and Communication 


There is a large class of phenomena in which what is observed is a
numerical quantity, or a sequence of numerical quantities,
distributed in time. The temperature as recorded by a continuous
recording thermometer, or the closing quotations of a stock in the
stock market, taken day by day, or the complete set of meteorological
data published from day to day by the Weather Bureau are all time
series, continuous or discrete, simple or multiple. These time series
are relatively slowly changing, and are well suited to a treatment 
employing hand computation or ordinary numerical tools such as 
slide rules and computing machines. Their study belongs to the 
more conventional parts of statistical theory. 

What is not generally realized is that the rapidly changing 
sequences of voltages in a telephone line or a television circuit or a
piece of radar apparatus belong just as truly to the field of statistics 
and time series, although the apparatus by means of which they are 
combined and modified must in general be very rapid in its action, 
and in fact must be able to put out results pari passu with the very
rapid alterations of input. These pieces of apparatus (telephone
receivers, wave filters, automatic sound-coding devices like the
Vocoder of the Bell Telephone Laboratories, frequency-modulating
networks and their corresponding receivers) are all in essence
quick-acting arithmetical devices, corresponding to the whole
apparatus of computing machines and schedules, and the staff of
computers, of the statistical laboratory. The ingenuity needed in
their use has been built into them in advance, just as it has into the
automatic range finders and gun pointers of an anti-aircraft
fire-control system, and for the same reasons. The chain of operation
has to work too fast to admit of any human links.

One and all, time series and the apparatus to deal with them, 
whether in the computing laboratory or in the telephone circuit, have
to deal with the recording, preservation, transmission, and use of
information. What is this information, and how is it measured?
One of the simplest, most unitary forms of information is the recording
of a choice between two equally probable simple alternatives,
one or the other of which is bound to happen: a choice, for example,
between heads and tails in the tossing of a coin. We shall call a
single choice of this sort a decision. If then we ask for the amount of
information in the perfectly precise measurement of a quantity
known to lie between A and B, which may with uniform a priori
probability lie anywhere in this range, we shall see that if we put
A = 0 and B = 1, and represent the quantity in the binary scale by
the infinite binary number $.a_1 a_2 a_3 \cdots a_n \cdots$, where $a_1, a_2, \cdots$, each
has the value 0 or 1, then the number of choices made and the
consequent amount of information is infinite. Here

    $$.a_1 a_2 a_3 \cdots a_n \cdots = \frac{1}{2} a_1 + \frac{1}{2^2} a_2 + \cdots + \frac{1}{2^n} a_n + \cdots \tag{3.01}$$


However, no measurement which we actually make is performed
with perfect precision. If the measurement has a uniformly distributed
error lying over a range of length $.b_1 b_2 \cdots b_n \cdots$, where
$b_k$ is the first digit not equal to 0, we shall see that all the decisions
from $a_1$ to $a_{k-1}$, and possibly to $a_k$, are significant, while all the
later decisions are not. The number of decisions made is certainly
not far from

    $$-\log_2 .b_1 b_2 \cdots b_n \cdots \tag{3.02}$$

and we shall take this quantity as the precise formula for the amount 
of information and its definition. 
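The counting of decisions can be tried on a small case. The sketch below is only an illustration: the helper names are invented here, and the sample value 0.625 and error range 1/8 are chosen so that the binary arithmetic is exact.

```python
import math


def binary_digits(value, count):
    """First `count` binary digits a_1, ..., a_count of a number in [0, 1)."""
    digits = []
    for _ in range(count):
        value *= 2
        bit = int(value)
        digits.append(bit)
        value -= bit
    return digits


def information_bits(error_range):
    """Amount of information, -log2 of the length of the error range (Eq. 3.02)."""
    return -math.log2(error_range)


digits = binary_digits(0.625, 4)
rebuilt = sum(bit / 2 ** (k + 1) for k, bit in enumerate(digits))

print("digits:", digits)
print("rebuilt value:", rebuilt)
print("decisions for an error range of 1/8:", information_bits(0.125))
```

With an error range of 1/8 the first nonzero digit of the range is the third, and the formula gives three decisions, in agreement with the statement that the digits up to about that place are the significant ones.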

We may conceive this in the following way: we know a priori that
a variable lies between 0 and 1, and a posteriori that it lies on the
interval (a, b) inside (0, 1). Then the amount of information we
have from our a posteriori knowledge is

    $$-\log_2 \frac{\text{measure of } (a, b)}{\text{measure of } (0, 1)} \tag{3.03}$$


However, let us now consider a case where our a priori knowledge is
that the probability that a certain quantity should lie between x and
x + dx is $f_1(x)\,dx$, and the a posteriori probability is $f_2(x)\,dx$. How
much new information does our a posteriori probability give us?

This problem is essentially that of attaching a width to the regions
under the curves $y = f_1(x)$ and $y = f_2(x)$. It will be noted that we
are here assuming the variable to have a fundamental equipartition;
that is, our results will not in general be the same if we replace x by $x^3$
or any other function of x. Since $f_1(x)$ is a probability density, we
shall have

    $$\int_{-\infty}^{\infty} f_1(x)\,dx = 1 \tag{3.04}$$

so that the average logarithm of the breadth of the region under
$f_1(x)$ may be considered as some sort of average of the height of the
logarithm of the reciprocal of $f_1(x)$. Thus a reasonable measure 1 of
the amount of information associated with the curve $f_1(x)$ is

    $$\int_{-\infty}^{\infty} [\log_2 f_1(x)] f_1(x)\,dx \tag{3.05}$$

The quantity we here define as amount of information is the
negative of the quantity usually defined as entropy in similar
situations. The definition here given is not the one given by R. A.
Fisher for statistical problems, although it is a statistical definition,
and can be used to replace Fisher’s definition in the technique of
statistics.

In particular, if $f_1(x)$ is a constant over (a, b) and is zero elsewhere,

    $$\int_{-\infty}^{\infty} [\log_2 f_1(x)] f_1(x)\,dx = \frac{b-a}{b-a} \log_2 \frac{1}{b-a} = \log_2 \frac{1}{b-a} \tag{3.06}$$

Using this to compare the information that a point is in the region
(0, 1) with the information that it is in the region (a, b), we obtain
for the measure of the difference

    $$\log_2 \frac{1}{b-a} - \log_2 1 = \log_2 \frac{1}{b-a} \tag{3.07}$$
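The value obtained for a constant density can be confirmed by integrating numerically. This is a sketch under stated assumptions: the integral of the density times its base-2 logarithm is approximated by a midpoint rule, the interval (0.25, 0.5) is an arbitrary choice, and the helper names are illustrative.

```python
import math


def information(density, lo, hi, samples=10_000):
    """Midpoint-rule estimate of the integral of f(x) * log2 f(x) over (lo, hi)."""
    width = (hi - lo) / samples
    total = 0.0
    for k in range(samples):
        fx = density(lo + (k + 0.5) * width)
        if fx > 0.0:  # the integrand is taken as 0 where the density vanishes
            total += fx * math.log2(fx) * width
    return total


a, b = 0.25, 0.5


def uniform_on_ab(x):
    """Constant density 1 / (b - a) on (a, b), zero elsewhere."""
    return 1.0 / (b - a) if a < x < b else 0.0


def uniform_on_unit(x):
    """Constant density 1 on (0, 1)."""
    return 1.0 if 0.0 < x < 1.0 else 0.0


gain = information(uniform_on_ab, 0.0, 1.0) - information(uniform_on_unit, 0.0, 1.0)
print("information gained:", gain)
print("log2(1 / (b - a)) :", math.log2(1.0 / (b - a)))
```

Narrowing the range from the unit interval to an interval of length 1/4 gives two units of information, that is, two binary decisions.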

The definition which we have given for the amount of information
is applicable when the variable x is replaced by a variable ranging
over two or more dimensions. In the two-dimensional case, $f_1(x, y)$ is
a function such that

    $$\int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dy\, f_1(x, y) = 1 \tag{3.08}$$

and the amount of information is

    $$\int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dy\, f_1(x, y) \log_2 f_1(x, y) \tag{3.081}$$

Note that if $f_1(x, y)$ is of the form $\phi(x)\psi(y)$ and

    $$\int_{-\infty}^{\infty} \phi(x)\,dx = \int_{-\infty}^{\infty} \psi(y)\,dy = 1 \tag{3.082}$$

then

    $$\int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dy\, \phi(x)\psi(y) = 1 \tag{3.083}$$

and

    $$\int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dy\, f_1(x, y) \log_2 f_1(x, y) = \int_{-\infty}^{\infty} dx\, \phi(x) \log_2 \phi(x) + \int_{-\infty}^{\infty} dy\, \psi(y) \log_2 \psi(y) \tag{3.084}$$

and the amount of information from independent sources is additive.

1 Here the author makes use of a personal communication of J. von Neumann.
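The additivity for independent sources can be checked on a product density. The sketch below assumes two made-up densities on the unit interval and approximates the integrals by a midpoint rule on a modest grid; none of the names come from the text.

```python
import math


def xlog2x(t):
    """t * log2(t), taken as 0 at t = 0."""
    return t * math.log2(t) if t > 0.0 else 0.0


def info_1d(density, samples=200):
    """Midpoint estimate of the one-dimensional information integral on (0, 1)."""
    w = 1.0 / samples
    return sum(xlog2x(density((k + 0.5) * w)) for k in range(samples)) * w


def info_2d(joint, samples=200):
    """Midpoint estimate of the two-dimensional information integral on the unit square."""
    w = 1.0 / samples
    total = 0.0
    for i in range(samples):
        x = (i + 0.5) * w
        for j in range(samples):
            total += xlog2x(joint(x, (j + 0.5) * w))
    return total * w * w


def phi(x):
    """A hypothetical density on (0, 1)."""
    return 2.0 * x


def psi(y):
    """A second hypothetical density on (0, 1)."""
    return 2.0 * (1.0 - y)


def joint(x, y):
    """Joint density of two independent variables."""
    return phi(x) * psi(y)


print("joint information:", info_2d(joint))
print("sum of the parts :", info_1d(phi) + info_1d(psi))
```

The two printed values agree to rounding error, as the factorization of the logarithm of a product requires.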

An interesting problem is that of determining the information
gained by fixing one or more variables in a problem. For example,
let us suppose that a variable u lies between x and x + dx with the
probability $\exp(-x^2/2a)\,dx/\sqrt{2\pi a}$, while a variable v lies between
the same two limits with a probability $\exp(-x^2/2b)\,dx/\sqrt{2\pi b}$. How
much information do we gain concerning u if we know that
u + v = w? In this case, it is clear that u = w - v, where w is
fixed. We assume the a priori distributions of u and v to be independent.
Then the a posteriori distribution of u is proportional to

    $$\exp\left(-\frac{x^2}{2a}\right) \exp\left[-\frac{(w-x)^2}{2b}\right] = c_1 \exp\left[-(x - c_2)^2 \left(\frac{a+b}{2ab}\right)\right] \tag{3.09}$$

where $c_1$ and $c_2$ are constants. They both disappear in the formula
for the gain in information given by the fixing of w.

The excess of information concerning x when we know w to be that
which we have in advance is

    $$\begin{aligned}
    &\frac{1}{\sqrt{2\pi[ab/(a+b)]}} \int_{-\infty}^{\infty} \left\{\exp\left[-(x-c_2)^2\left(\frac{a+b}{2ab}\right)\right]\right\} \left[-\frac{1}{2}\log_2 2\pi\left(\frac{ab}{a+b}\right) - (x-c_2)^2\left(\frac{a+b}{2ab}\right)\log_2 e\right] dx \\
    &\quad - \frac{1}{\sqrt{2\pi a}} \int_{-\infty}^{\infty} \left[\exp\left(-\frac{x^2}{2a}\right)\right] \left(-\frac{1}{2}\log_2 2\pi a - \frac{x^2}{2a}\log_2 e\right) dx = \frac{1}{2}\log_2\left(\frac{a+b}{b}\right)
    \end{aligned} \tag{3.091}$$

Note that this expression (Eq. 3.091) is positive, and that it is
independent of w. It is one-half the logarithm of the ratio of the sum of
the mean squares of u and v to the mean square of v. If v has only a
small range of variation, the amount of information concerning u
which a knowledge of u + v gives is large, and it becomes infinite as
b goes to 0.
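The closed form for the gain can be compared with a direct numerical integration. In the sketch below, the function names, the integration span of twelve standard deviations, and the sample values of a and b are assumptions made for the illustration; the closed form used for a Gaussian density of variance $\sigma^2$ is $-\frac{1}{2}\log_2(2\pi e\sigma^2)$.

```python
import math


def gaussian_information(variance):
    """Closed form of the integral of f * log2(f) for a Gaussian density."""
    return -0.5 * math.log2(2 * math.pi * math.e * variance)


def numeric_information(variance, samples=20_000, span=12.0):
    """Midpoint-rule estimate of the same integral over +/- span standard deviations."""
    sigma = math.sqrt(variance)
    lo = -span * sigma
    width = 2 * span * sigma / samples
    total = 0.0
    for k in range(samples):
        x = lo + (k + 0.5) * width
        fx = math.exp(-x * x / (2 * variance)) / math.sqrt(2 * math.pi * variance)
        if fx > 0.0:
            total += fx * math.log2(fx) * width
    return total


def information_gain(a, b):
    """Gain concerning u from knowing u + v: posterior minus prior information."""
    posterior_variance = a * b / (a + b)
    return gaussian_information(posterior_variance) - gaussian_information(a)


a, b = 3.0, 1.0
print("gain (closed form)     :", information_gain(a, b))
print("0.5 * log2((a + b) / b):", 0.5 * math.log2((a + b) / b))
print("gain (numerical)       :",
      numeric_information(a * b / (a + b)) - numeric_information(a))
```

With a message of mean square 3 and a noise of mean square 1, the gain is exactly one binary decision, and it grows without bound as the noise mean square b shrinks.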

We can consider this result in the following light: let us treat u as a
message and v as a noise. Then the information carried by a precise 
message in the absence of a noise is infinite. In the presence of a 
noise, however, this amount of information is finite, and it approaches 
0 very rapidly as the noise increases in intensity. 

We have said that amount of information, being the negative
logarithm of a quantity which we may consider as a probability, is
essentially a negative entropy. It is interesting to show that, on 
the average, it has the properties we associate with an entropy. 

Let $\phi(x)$ and $\psi(x)$ be two probability densities; then $[\phi(x) + \psi(x)]/2$
is also a probability density. Then

    $$\int_{-\infty}^{\infty} \frac{\phi(x) + \psi(x)}{2} \log \frac{\phi(x) + \psi(x)}{2}\,dx \le \int_{-\infty}^{\infty} \frac{\phi(x)}{2} \log \phi(x)\,dx + \int_{-\infty}^{\infty} \frac{\psi(x)}{2} \log \psi(x)\,dx \tag{3.10}$$

This follows from the fact that

    $$\frac{a+b}{2} \log \frac{a+b}{2} \le \frac{1}{2}(a \log a + b \log b) \tag{3.11}$$

In other words, the overlap of the regions under $\phi(x)$ and $\psi(x)$
reduces the maximum information belonging to $\phi(x) + \psi(x)$. On
the other hand, if $\phi(x)$ is a probability density vanishing outside
(a, b),

    $$\int_{-\infty}^{\infty} \phi(x) \log \phi(x)\,dx \tag{3.12}$$

is a minimum when $\phi(x) = 1/(b - a)$ over (a, b) and is zero elsewhere.
This follows from the fact that the logarithm curve is convex
upward.
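The inequality for the mixture of two densities can be seen in a simple case. The sketch below assumes two overlapping uniform densities chosen for convenience and approximates the integrals by a midpoint rule with base-2 logarithms; the helper names are illustrative.

```python
import math


def xlog2x(t):
    """t * log2(t), taken as 0 at t = 0."""
    return t * math.log2(t) if t > 0.0 else 0.0


def information(density, lo, hi, samples=20_000):
    """Midpoint-rule estimate of the integral of f * log2(f) over (lo, hi)."""
    width = (hi - lo) / samples
    return sum(xlog2x(density(lo + (k + 0.5) * width)) for k in range(samples)) * width


def phi(x):
    """Uniform density on (0, 1)."""
    return 1.0 if 0.0 < x < 1.0 else 0.0


def psi(x):
    """Uniform density on (0.5, 1.5), overlapping phi on (0.5, 1)."""
    return 1.0 if 0.5 < x < 1.5 else 0.0


def mixture(x):
    """The averaged density [phi(x) + psi(x)] / 2."""
    return 0.5 * (phi(x) + psi(x))


left = information(mixture, 0.0, 1.5)
right = 0.5 * information(phi, 0.0, 1.5) + 0.5 * information(psi, 0.0, 1.5)
print("information of the mixture:", left)
print("average of the parts      :", right)
```

Here each part carries zero information relative to a unit interval, while the mixture, spread over a longer range, carries about minus one-half, so the left side of the inequality is the smaller one.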

It will be seen that the processes which lose information are, as we 
should expect, closely analogous to the processes which gain entropy.
They consist in the fusion of regions of probability which were
originally distinct. For example, if we replace the distribution of a
certain variable by the distribution of a function of that variable
which takes the same value for different arguments, or if in a function
of several variables we allow some of them to range unimpeded over
their natural range of variability, we lose information. No operation
on a message can gain information on the average. Here we have a
precise application of the second law of thermodynamics in
communication engineering. Conversely, the greater specification of an
ambiguous situation, on the average, will, as we have seen, generally 
gain information and never lose it. 

An interesting case is when we have a probability distribution
with n-fold density $f(x_1, \cdots, x_n)$ over the variables $(x_1, \cdots, x_n)$, and
where we have m dependent variables $y_1, \cdots, y_m$. How much
information do we get by fixing these m variables? First let them be
fixed between the limits $y_1^*, y_1^* + dy_1^*; \cdots; y_m^*, y_m^* + dy_m^*$. Let
us take as a new set of variables $x_1, x_2, \cdots, x_{n-m}, y_1, y_2, \cdots, y_m$.
Then over the new set of variables, our distribution function
will be proportional to $f(x_1, \cdots, x_n)$ over the region R given by
$y_1^* < y_1 < y_1^* + dy_1^*, \cdots, y_m^* < y_m < y_m^* + dy_m^*$ and 0 outside.
Thus the amount of information obtained by the specification of the
y's will be

    $$\begin{aligned}
    &\frac{\int_R dx_1 \cdots \int dx_n\, f(x_1, \cdots, x_n) \log_2 f(x_1, \cdots, x_n)}{\int_R dx_1 \cdots \int dx_n\, f(x_1, \cdots, x_n)} - \int_{-\infty}^{\infty} dx_1 \cdots \int_{-\infty}^{\infty} dx_n\, f(x_1, \cdots, x_n) \log_2 f(x_1, \cdots, x_n) \\
    &= \frac{\int_{-\infty}^{\infty} dx_1 \cdots \int_{-\infty}^{\infty} dx_{n-m} \left| J\left(\frac{y_1^*, \cdots, y_m^*}{x_{n-m+1}, \cdots, x_n}\right) \right|^{-1} f(x_1, \cdots, x_n) \log_2 f(x_1, \cdots, x_n)}{\int_{-\infty}^{\infty} dx_1 \cdots \int_{-\infty}^{\infty} dx_{n-m} \left| J\left(\frac{y_1^*, \cdots, y_m^*}{x_{n-m+1}, \cdots, x_n}\right) \right|^{-1} f(x_1, \cdots, x_n)} \\
    &\quad - \int_{-\infty}^{\infty} dx_1 \cdots \int_{-\infty}^{\infty} dx_n\, f(x_1, \cdots, x_n) \log_2 f(x_1, \cdots, x_n)
    \end{aligned} \tag{3.13}$$

Closely related to this problem is the generalization of that which
we discussed in Eq. 3.13; in the case just discussed, how much
information do we have concerning the variables $x_1, \cdots, x_{n-m}$ alone?
Here the a priori probability density of these variables is

    $$\int_{-\infty}^{\infty} dx_{n-m+1} \cdots \int_{-\infty}^{\infty} dx_n\, f(x_1, \cdots, x_n) \tag{3.14}$$

and the un-normalized probability density after fixing the y*'s is

    $$\sum \left| J\left(\frac{y_1^*, \cdots, y_m^*}{x_{n-m+1}, \cdots, x_n}\right) \right|^{-1} f(x_1, \cdots, x_n) \tag{3.141}$$

where the $\sum$ is taken over all sets of points $(x_{n-m+1}, \cdots, x_n)$
corresponding to a given set of y*'s. On this basis, we may easily
write down the solution to our problem, though it will be a bit
lengthy. If we take the set $(x_1, \cdots, x_{n-m})$ to be a generalized
message, the set $(x_{n-m+1}, \cdots, x_n)$ to be a generalized noise, and the
y*'s to be a generalized corrupted message, we see that we have given
the solution of a generalization of the problem of Expression 3.141.

We have thus at least a formal solution of a generalization of the
message-noise problem which we have already stated. A set of
observations depends in an arbitrary way on a set of messages and
noises with a known combined distribution. We wish to ascertain
how much information these observations give concerning the
messages alone. This is a central problem of communication engineering.
It enables us to evaluate different systems, such as amplitude
modulation or frequency modulation or phase modulation, as far as
their efficiency in transmitting information is concerned. This is a
technical problem and not suitable to a detailed discussion here;
however, certain remarks are in order. In the first place, it can be
shown that with the definition of information given here, with a
random “static” on the ether equidistributed in frequency as far as
power is concerned, and with a message restricted to a definite
frequency range and a definite power output for this range, no means
of transmission of information is more efficient than amplitude
modulation, although other means may be as efficient. On the other hand,
the information transmitted by this means is not necessarily in the
form most suitable for reception by the ear or by any other given
receptor. Here the specific characteristics of the ear and of other
receptors must be considered by employing a theory very similar to
the one just developed. In general, the efficient use of amplitude
modulation or any other form of modulation must be supplemented
by the use of decoding devices adequate to transforming the received
information into a form suitable for reception by human receptors or
use by mechanical receptors. Similarly, the original message must be
coded for the greatest compression in transmission. This problem
has been attacked, at least in part, in the design of the “Vocoder”
system by the Bell Telephone Laboratories, and the relevant general
theory has been presented in a very satisfactory form by Dr. C.
Shannon of those laboratories.
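
The message-noise question can be illustrated in a much simpler setting than the continuous one treated here. The following is a minimal sketch, assuming a finite joint distribution of message and noise; the function name `information_about_message`, the toy distribution, and the XOR combination rule are hypothetical choices made for illustration and do not come from the text.

```python
from collections import defaultdict
from math import log2

def information_about_message(joint, combine):
    """Information, in bits, that the observation gives about the message.

    joint   : dict mapping (message, noise) -> probability (toy input)
    combine : function (message, noise) -> observed corrupted message
    """
    p_msg = defaultdict(float)    # marginal distribution of the message
    p_obs = defaultdict(float)    # marginal distribution of the observation
    p_both = defaultdict(float)   # joint distribution of (message, observation)
    for (m, n), p in joint.items():
        y = combine(m, n)
        p_msg[m] += p
        p_obs[y] += p
        p_both[(m, y)] += p
    # Mutual information between message and observation.
    return sum(p * log2(p / (p_msg[m] * p_obs[y]))
               for (m, y), p in p_both.items() if p > 0)

# Toy case: a fair binary message, flipped by independent noise
# with probability 0.25 (an assumed example, not from the text).
joint = {(m, n): 0.5 * (0.25 if n else 0.75) for m in (0, 1) for n in (0, 1)}
bits = information_about_message(joint, lambda m, n: m ^ n)
```

With noise present, `bits` is strictly less than the one bit carried by the message itself; with the noise fixed at a single value it equals one bit.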

So much for the definition and technique of measuring information.
We shall now discuss the way in which information may be presented
in a form homogeneous in time. Let it be noted that most of the
telephone and other communication devices are actually not attached
to a particular origin in time. There is indeed one operation which
seems to contradict this, but which really does not. This is the
operation of modulation. This, in its simplest form, converts a
message $f(t)$ into one of the form $f(t) \sin (at + b)$. If, however, we
regard the factor $\sin (at + b)$ as an extra message which is put into
the apparatus, it will be seen that the situation will come under our
general statement. The extra message, which we call the carrier,
adds nothing to the rate at which the system is carrying information.
All the information it contains is conveyed in an arbitrarily short
interval of time, and thereafter nothing new is said.

A message homogeneous in time, or, as the statisticians call it, a
time series which is in statistical equilibrium, is thus a single function
or a set of functions of the time, which forms one of an ensemble of
such sets with a well-defined probability distribution, not altered by
the change of $t$ to $t + \tau$ throughout. That is, the transformation
group consisting of the operators $T^\lambda$ which change $f(t)$ into $f(t + \lambda)$
leaves the probability of the ensemble invariant. The group
satisfies the properties that

$$T^\lambda[T^\mu f(t)] = T^{\mu+\lambda} f(t) \qquad (-\infty < \lambda < \infty,\ -\infty < \mu < \infty) \tag{3.15}$$

It follows from this that if $\Phi[f(t)]$ is a “functional” of $f(t)$ (that is,
a number depending upon the whole history of $f(t)$) and if the
average of $f(t)$ over the whole ensemble is finite, we are in a position
to use the Birkhoff ergodic theorem quoted in the previous chapter,
and to come to the conclusion that, except for a set of values of
$f(t)$ of zero probability, the time-average of $\Phi[f(t)]$, or in symbols,

$$\lim_{A\to\infty} \frac{1}{A} \int_0^A \Phi[f(t + \tau)]\, d\tau = \lim_{A\to\infty} \frac{1}{A} \int_{-A}^0 \Phi[f(t + \tau)]\, d\tau \tag{3.16}$$

exists.

There is even more here than this. We have stated in the previous
chapter another theorem of ergodic character, due to von Neumann,
which states that, except for a set of elements of zero probability,
any element belonging to a system which goes into itself under a
group of measure-preserving transformations such as Eq. 3.16
belongs to a sub-set (which may be the whole set) which goes into
itself under the same transformation, which has a measure defined
over itself and also invariant under the transformation, and which
has the further property that any portion of this sub-set with measure
preserved under the group of transformations either has the maximum
measure of the sub-set, or measure 0. If we discard all elements
except those of such a sub-set, and use its appropriate measure, we
shall find that the time average (Eq. 3.16) is in almost all cases the
average of $\Phi[f(t)]$ over all the space of functions $f(t)$, the so-called
phase average. Thus in the case of such an ensemble of functions $f(t)$,
except in a set of cases of zero probability, we can deduce the average
of any statistical parameter of the ensemble (indeed we can
simultaneously deduce any countable set of such parameters of the
ensemble) from the record of any one of the component time series,
by using a time average instead of a phase average. Moreover, we
need to know only the past of almost any one time series of the class.
In other words, given the entire history up to the present of a time
series known to belong to an ensemble in statistical equilibrium, we
can compute with probable error zero the entire set of statistical
parameters of an ensemble in statistical equilibrium to which that
time series belongs. Up to here, we have formulated this for single
time series; it is equally true, however, for multiple time series in
which we have several quantities varying simultaneously, rather
than a single varying quantity.
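
The replacement of a phase average by a time average can be tried numerically. The sketch below is only an illustration under stated assumptions: it uses a discrete-time stationary series $x_{t+1} = a x_t + e_t$ with Gaussian $e_t$ (not one of the ensembles constructed in this chapter), and the helper name `time_average_of_square` is hypothetical. For that series the phase average of $x^2$ is $1/(1 - a^2)$.

```python
import random

def time_average_of_square(a, steps, seed):
    """Time average of x_t**2 along ONE realization of x_{t+1} = a*x_t + e_t."""
    rng = random.Random(seed)
    # Draw the starting value from the stationary distribution, so that the
    # series is in statistical equilibrium from the first step.
    x = rng.gauss(0.0, (1.0 / (1.0 - a * a)) ** 0.5)
    total = 0.0
    for _ in range(steps):
        total += x * x
        x = a * x + rng.gauss(0.0, 1.0)
    return total / steps

a = 0.5
phase_average = 1.0 / (1.0 - a * a)   # ensemble average of x**2
# Two different members of the ensemble; each record alone is used.
time_avg_1 = time_average_of_square(a, 200_000, seed=1)
time_avg_2 = time_average_of_square(a, 200_000, seed=2)
```

Both time averages should lie close to `phase_average`, although each is computed from a single record of the series.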

We are now in a position to discuss various problems belonging to
time series. We shall confine our attention to those cases where the
entire past of a time series can be given in terms of a countable set of
quantities. For example, for quite a wide class of functions $f(t)$
$(-\infty < t < \infty)$, we have fully determined $f$ when we know the set of
quantities

$$a_n = \int_{-\infty}^{0} e^{t} t^n f(t)\, dt \qquad (n = 0, 1, 2, \dots) \tag{3.17}$$

Now let $A$ be some function of the values of $t$ in the future, that is,
for arguments greater than 0. Then we can determine the
simultaneous distribution of $(a_0, a_1, \dots, a_n, A)$ from the past of almost
any single time series if the set of $f$'s is taken in its narrowest possible
sense. In particular, if $a_0, \dots, a_n$ are all given, we may determine
the distribution of $A$. Here we appeal to the known theorem of
Nikodym on conditional probabilities. The same theorem will
assure us that this distribution, under very general circumstances,
will tend to a limit as $n \to \infty$ and this limit will give us all the
knowledge there is concerning the distribution of any future quantity.
We may similarly determine the simultaneous distribution of values
of any set of future quantities, or any set of quantities depending
both on the past and the future, when the past is known. If then
we have given any adequate interpretation to the “best value” of
any of these statistical parameters or sets of statistical parameters
(in the sense, perhaps, of a mean or a median or a mode), we can
compute it from the known distribution, and obtain a prediction to
meet any desired criterion of goodness of prediction. We can
compute the merit of the prediction, using any desired statistical
basis of this merit: mean square error or maximum error or mean
absolute error, and so on. We can compute the amount of
information concerning any statistical parameter or set of statistical
parameters, which fixing of the past will give us. We can even compute
the whole amount of information which a knowledge of the past will
give us of the whole future beyond a certain point; although when
this point is the present, we shall in general know the latter from the
past, and our knowledge of the present will contain an infinite
amount of information.
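
How the “best value” depends on the criterion of merit can be seen on a small example. This is a hedged sketch, assuming a hypothetical three-point conditional distribution and a brute-force search over a grid of candidate predictions; the names `expected_loss` and `best_guess` are illustrative only. It relies on the standard facts that the mean minimizes the mean square error and a median minimizes the mean absolute error.

```python
def expected_loss(dist, guess, loss):
    """Average of loss(value - guess) over a discrete distribution."""
    return sum(p * loss(v - guess) for v, p in dist.items())

def best_guess(dist, loss, candidates):
    """Candidate prediction with the smallest expected loss."""
    return min(candidates, key=lambda g: expected_loss(dist, g, loss))

# Hypothetical distribution of a future quantity, given the past.
dist = {0.0: 0.4, 1.0: 0.35, 10.0: 0.25}
candidates = [i / 100 for i in range(0, 1001)]   # grid 0.00, 0.01, ..., 10.00

mean = sum(v * p for v, p in dist.items())
best_sq = best_guess(dist, lambda e: e * e, candidates)   # mean square error
best_abs = best_guess(dist, abs, candidates)              # mean absolute error
```

Here `best_sq` coincides with the mean of the distribution, while `best_abs` falls on its median, so the two criteria give different predictions from the same distribution.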

Another interesting situation is that of a multiple time series, in
which we know precisely only the pasts of some of the components.
The distribution of any quantity involving more than these pasts can
be studied by means very similar to those already suggested. In
particular, we may wish to know the distribution of a value of another
component, or a set of values of other components, at some point of
time, past, present, or future. The general problem of the wave
filter belongs to this class. We have a message, together with a
noise, combined in some way into a corrupted message, of which we
know the past. We also know the statistical joint distribution of
the message and the noise as time series. We ask for the distribution
of the values of the message at some given time, past, present, and
future. We then ask for an operator on the past of the corrupted
message which will best give this true message, in some given
statistical sense. We may ask for a statistical estimate of some measure
of the error of our knowledge of the message. Finally, we may ask
for the amount of information which we possess concerning the
message.

There is one ensemble of time series which is particularly simple and
central. This is the ensemble associated with the Brownian motion.
The Brownian motion is the motion of a particle in a gas, impelled by
the random impacts of the other particles in a state of thermal
agitation. The theory has been developed by many writers, among
them Einstein, Smoluchowski, Perrin, and the author.¹ Unless we
go down in the time scale to intervals so small that the individual
impacts of the particles on one another are discernible, the motion
shows a curious kind of undifferentiability. The mean square
motion in a given direction over a given time is proportional to the
length of that time, and the motions over successive times are
completely uncorrelated. This conforms closely to the physical
observations. If we normalize the scale of the Brownian motion to
fit the time scale, and consider only one coordinate $x$ of the motion,
and if we let $x(t)$ equal 0 for $t = 0$, then the probability that if
$0 \le t_1 \le t_2 \le \cdots \le t_n$ the particles lie between $x_1$ and $x_1 + dx_1$ at
time $t_1$, $\dots$, between $x_n$ and $x_n + dx_n$ at time $t_n$, is

$$\frac{\exp\left[-\dfrac{x_1^2}{2t_1} - \dfrac{(x_2 - x_1)^2}{2(t_2 - t_1)} - \cdots - \dfrac{(x_n - x_{n-1})^2}{2(t_n - t_{n-1})}\right]}{\sqrt{(2\pi)^n\, t_1 (t_2 - t_1) \cdots (t_n - t_{n-1})}}\, dx_1 \cdots dx_n \tag{3.18}$$


On the basis of the probability system corresponding to this,
which is unambiguous, we can make the set of paths corresponding
to the different possible Brownian motions depend on a parameter $\alpha$
lying between 0 and 1, in such a way that each path is a function
$x(t, \alpha)$, where $x$ depends on the time $t$ and the parameter of
distribution $\alpha$, and where the probability that a path lies in a certain set $S$
is the same as the measure of the set of values of $\alpha$ corresponding to
paths in $S$. On this basis, almost all paths will be continuous and
non-differentiable.
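
The two properties just stated, a mean square displacement proportional to the elapsed time and uncorrelated motions over successive times, can be checked on simulated paths. The sketch below is an illustration under assumptions: the Brownian motion is approximated by sums of small independent Gaussian steps, and the function name `brownian_endpoints` and all numerical settings are hypothetical.

```python
import random

def brownian_endpoints(n_paths, steps_per_unit, seed):
    """Sample (x(1), x(2)) for discretized Brownian paths with x(0) = 0."""
    rng = random.Random(seed)
    sigma = (1.0 / steps_per_unit) ** 0.5   # each step has variance dt
    samples = []
    for _ in range(n_paths):
        x = 0.0
        for _ in range(steps_per_unit):      # motion over (0, 1)
            x += rng.gauss(0.0, sigma)
        x1 = x
        for _ in range(steps_per_unit):      # motion over (1, 2)
            x += rng.gauss(0.0, sigma)
        samples.append((x1, x))
    return samples

samples = brownian_endpoints(n_paths=4000, steps_per_unit=16, seed=0)
n = len(samples)
msd_1 = sum(x1 * x1 for x1, _ in samples) / n        # mean square motion up to t = 1
msd_2 = sum(x2 * x2 for _, x2 in samples) / n        # mean square motion up to t = 2
increment_corr = sum(x1 * (x2 - x1) for x1, x2 in samples) / n
```

Up to sampling error, `msd_2` is about twice `msd_1`, and `increment_corr`, the average product of the motions over the two successive intervals, is close to zero.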

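A very interesting question is that of determining the average with
respect to $\alpha$ of $x(t_1, \alpha) \cdots x(t_n, \alpha)$. This will be

$$\int_0^1 d\alpha\, x(t_1, \alpha) x(t_2, \alpha) \cdots x(t_n, \alpha)$$
$$= (2\pi)^{-n/2} [t_1 (t_2 - t_1) \cdots (t_n - t_{n-1})]^{-1/2} \int_{-\infty}^{\infty} d\xi_1 \cdots \int_{-\infty}^{\infty} d\xi_n\, \xi_1 \xi_2 \cdots \xi_n \exp\left[-\frac{\xi_1^2}{2t_1} - \frac{(\xi_2 - \xi_1)^2}{2(t_2 - t_1)} - \cdots - \frac{(\xi_n - \xi_{n-1})^2}{2(t_n - t_{n-1})}\right] \tag{3.19}$$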


¹ Paley, R. E. A. C., and N. Wiener, “Fourier Transforms in the Complex Domain,”
Colloquium Publications, Vol. 19, American Mathematical Society, New York, 1934,
Chapter 10.






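under the assumption that $0 \le t_1 \le \cdots \le t_n$. Let us put

$$\xi_1 \cdots \xi_n = \sum_k A_k\, \xi_1^{\lambda_{k,1}} (\xi_2 - \xi_1)^{\lambda_{k,2}} \cdots (\xi_n - \xi_{n-1})^{\lambda_{k,n}} \tag{3.20}$$

where $\lambda_{k,1} + \lambda_{k,2} + \cdots + \lambda_{k,n} = n$. The value of the expression
in Eq. 3.19 will become (with $t_0 = 0$)

$$\sum_k A_k (2\pi)^{-n/2} [t_1 (t_2 - t_1) \cdots (t_n - t_{n-1})]^{-1/2} \prod_j \int_{-\infty}^{\infty} d\xi\, \xi^{\lambda_{k,j}} \exp\left[-\frac{\xi^2}{2(t_j - t_{j-1})}\right]$$

$$= \sum_k A_k \prod_j \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \xi^{\lambda_{k,j}} \exp\left(-\frac{\xi^2}{2}\right) d\xi\; (t_j - t_{j-1})^{\lambda_{k,j}/2}$$

$$= \begin{cases} 0 & \text{if any } \lambda_{k,j} \text{ is odd} \\ \sum_k A_k \prod_j (\lambda_{k,j} - 1)(\lambda_{k,j} - 3) \cdots 5 \cdot 3 \cdot 1\; (t_j - t_{j-1})^{\lambda_{k,j}/2} & \text{if every } \lambda_{k,j} \text{ is even} \end{cases} \tag{3.21}$$

$$= \sum_k A_k \prod_j (\text{number of ways of dividing } \lambda_{k,j} \text{ terms into pairs}) \times (t_j - t_{j-1})^{\lambda_{k,j}/2}$$

$$= \sum_k A_k\, (\text{number of ways of dividing } n \text{ terms into pairs whose elements both belong in the same group of } \lambda_{k,j} \text{ terms into which } \lambda \text{ is separated}) \times (t_j - t_{j-1})^{\lambda_{k,j}/2}$$

$$= \sum_j A_j \sum \prod \int_0^1 d\alpha\, [x(t_k, \alpha) - x(t_{k-1}, \alpha)][x(t_q, \alpha) - x(t_{q-1}, \alpha)]$$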

Here the first $\sum$ sums over $j$; the second, over all the ways of dividing
$n$ terms in blocks, respectively, of $\lambda_{k,1}, \dots, \lambda_{k,n}$ numbers into pairs;
and the $\prod$ is taken over those pairs of values $k$ and $q$, where $\lambda_{k,1}$ of
the elements to be selected from $t_k$ and $t_q$ are $t_1$, $\lambda_{k,2}$ are $t_2$, and so on.
It immediately results that

$$\int_0^1 d\alpha\, x(t_1, \alpha) x(t_2, \alpha) \cdots x(t_n, \alpha) = \sum \prod \int_0^1 d\alpha\, x(t_j, \alpha)\, x(t_k, \alpha) \tag{3.22}$$

where the $\sum$ is taken over all partitions of $t_1, \dots, t_n$ into distinct
pairs, and the $\prod$ over all the pairs in each partition. In other words,
when we know the averages of the products of $x(t_j, \alpha)$ by pairs, we
know the averages of all polynomials in these quantities, and thus
their entire statistical distribution.
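
The pairing rule of Eq. 3.22 lends itself to direct enumeration. The following is a minimal sketch, assuming the pair average of the normalized Brownian motion is $\min(s, t)$ for positive times (the value $|s|$ for $|s| < |t|$ that this chapter derives in Eq. 3.30); the function names `pair_partitions` and `brownian_moment` are hypothetical.

```python
def pair_partitions(items):
    """Yield every way of splitting `items` into unordered pairs."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in pair_partitions(remaining):
            yield [(first, partner)] + tail

def brownian_moment(times):
    """Average of x(t_1)...x(t_n), built from pair averages min(s, t)."""
    if len(times) % 2:
        return 0.0                    # no partition into pairs exists
    total = 0.0
    for partition in pair_partitions(list(times)):
        term = 1.0
        for s, t in partition:
            term *= min(s, t)         # average of x(s) x(t)
        total += term
    return total

fourth_moment = brownian_moment([2.0, 2.0, 2.0, 2.0])   # 3 pairings, each 2 * 2
```

For four equal times $t$ there are three pairings, giving the familiar fourth moment $3t^2$ of a Gaussian variable of variance $t$.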

Up to the present, we have considered Brownian motions $x(t, \alpha)$
where $t$ is positive. If we put

$$\xi(t, \alpha, \beta) = x(t, \alpha) \quad (t \ge 0), \qquad \xi(t, \alpha, \beta) = x(-t, \beta) \quad (t < 0) \tag{3.23}$$

where $\alpha$ and $\beta$ have independent uniform distributions over (0, 1),
we shall obtain a distribution of $\xi(t, \alpha, \beta)$ where $t$ runs over the whole
real infinite line. There is a well-known mathematical device to
map a square on a line segment in such a way that area goes into
length. All we need to do is to write our coordinates in the square
in the decimal form:

$$\alpha = .\alpha_1 \alpha_2 \cdots \alpha_n \cdots, \qquad \beta = .\beta_1 \beta_2 \cdots \beta_n \cdots \tag{3.24}$$

and to put

$$\gamma = .\alpha_1 \beta_1 \alpha_2 \beta_2 \cdots \alpha_n \beta_n \cdots$$

and we obtain a mapping of this sort which is one-one for almost all
points both in the line segment and the square. Using this
substitution, we define

$$\xi(t, \gamma) = \xi(t, \alpha, \beta) \tag{3.25}$$
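
The digit-interleaving device is easy to state on finite digit strings. The sketch below is illustrative only: it works with truncated decimal expansions written as strings, and the function names `interleave_digits` and `split_digits` are hypothetical.

```python
def interleave_digits(alpha_digits, beta_digits):
    """Interleave two decimal expansions: .a1a2... and .b1b2... -> .a1b1a2b2..."""
    if len(alpha_digits) != len(beta_digits):
        raise ValueError("pad both expansions to the same length first")
    return "".join(a + b for a, b in zip(alpha_digits, beta_digits))

def split_digits(gamma_digits):
    """Inverse map: recover the digits of alpha and beta from those of gamma."""
    return gamma_digits[0::2], gamma_digits[1::2]

# alpha = .1415 and beta = .7182 (arbitrary example digits)
gamma = interleave_digits("1415", "7182")
recovered = split_digits(gamma)
```

The digits in odd places of `gamma` return those of alpha and the digits in even places return those of beta, which is why the map can be inverted for almost all points.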

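We now wish to define

$$\int_{-\infty}^{\infty} K(t)\, d\xi(t, \gamma) \tag{3.26}$$

The obvious thing would be to define this as a Stieltjes¹ integral, but
$\xi$ is a very irregular function of $t$ and does not make such a definition
possible. If, however, $K$ runs sufficiently rapidly to 0 at $\pm\infty$ and is
a sufficiently smooth function, it is reasonable to put

$$\int_{-\infty}^{\infty} K(t)\, d\xi(t, \gamma) = -\int_{-\infty}^{\infty} K'(t)\, \xi(t, \gamma)\, dt \tag{3.27}$$

Under these circumstances, we have formally

$$\int_0^1 d\gamma \int_{-\infty}^{\infty} K_1(t)\, d\xi(t, \gamma) \int_{-\infty}^{\infty} K_2(t)\, d\xi(t, \gamma)$$
$$= \int_0^1 d\gamma \int_{-\infty}^{\infty} K_1'(t)\, \xi(t, \gamma)\, dt \int_{-\infty}^{\infty} K_2'(t)\, \xi(t, \gamma)\, dt$$
$$= \int_{-\infty}^{\infty} K_1'(s)\, ds \int_{-\infty}^{\infty} K_2'(t)\, dt \int_0^1 \xi(s, \gamma)\, \xi(t, \gamma)\, d\gamma \tag{3.28}$$

Now, if $s$ and $t$ are of opposite signs,

$$\int_0^1 \xi(s, \gamma)\, \xi(t, \gamma)\, d\gamma = 0 \tag{3.29}$$

while if they are of the same sign, and $|s| < |t|$,

$$\int_0^1 \xi(s, \gamma)\, \xi(t, \gamma)\, d\gamma = \int_0^1 x(|s|, \alpha)\, x(|t|, \alpha)\, d\alpha$$
$$= \frac{1}{2\pi \sqrt{|s|(|t| - |s|)}} \int_{-\infty}^{\infty} du \int_{-\infty}^{\infty} dv\; uv \exp\left[-\frac{u^2}{2|s|} - \frac{(v - u)^2}{2(|t| - |s|)}\right]$$
$$= \frac{1}{\sqrt{2\pi |s|}} \int_{-\infty}^{\infty} u^2 \exp\left(-\frac{u^2}{2|s|}\right) du = |s| \tag{3.30}$$

¹ Stieltjes, T. J., Annales de la Fac. des Sc. de Toulouse, 1894, p. 165; Lebesgue, H.,
Leçons sur l'Intégration, Gauthier-Villars et Cie, Paris, 1928.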


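Thus:

$$\int_0^1 d\gamma \int_{-\infty}^{\infty} K_1(t)\, d\xi(t, \gamma) \int_{-\infty}^{\infty} K_2(t)\, d\xi(t, \gamma)$$
$$= \int_0^{\infty} K_1'(s)\, ds \int_0^s t K_2'(t)\, dt + \int_0^{\infty} K_2'(s)\, ds \int_0^s t K_1'(t)\, dt - \int_{-\infty}^0 K_1'(s)\, ds \int_s^0 t K_2'(t)\, dt - \int_{-\infty}^0 K_2'(s)\, ds \int_s^0 t K_1'(t)\, dt$$
$$= \int_0^{\infty} K_1'(s)\, ds \left[s K_2(s) - \int_0^s K_2(t)\, dt\right] + \int_0^{\infty} K_2'(s)\, ds \left[s K_1(s) - \int_0^s K_1(t)\, dt\right]$$
$$\quad - \int_{-\infty}^0 K_1'(s)\, ds \left[-s K_2(s) - \int_s^0 K_2(t)\, dt\right] - \int_{-\infty}^0 K_2'(s)\, ds \left[-s K_1(s) - \int_s^0 K_1(t)\, dt\right]$$
$$= -\int_{-\infty}^{\infty} s\, d[K_1(s) K_2(s)] = \int_{-\infty}^{\infty} K_1(s) K_2(s)\, ds \tag{3.31}$$

In particular,

$$\int_0^1 d\gamma \int_{-\infty}^{\infty} K(t + \tau_1)\, d\xi(t, \gamma) \int_{-\infty}^{\infty} K(t + \tau_2)\, d\xi(t, \gamma) = \int_{-\infty}^{\infty} K(s) K(s + \tau_2 - \tau_1)\, ds \tag{3.32}$$

Moreover,

$$\int_0^1 d\gamma \prod_{k=1}^{n} \int_{-\infty}^{\infty} K(t + \tau_k)\, d\xi(t, \gamma) = \sum \prod \int_{-\infty}^{\infty} K(s) K(s + \tau_j - \tau_k)\, ds \tag{3.33}$$

where the sum is over all partitions of $\tau_1, \dots, \tau_n$ into pairs, and the
product is over the pairs in each partition.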

The expression

$$\int_{-\infty}^{\infty} K(t + \tau)\, d\xi(\tau, \gamma) = f(t, \gamma) \tag{3.34}$$

represents a very important ensemble of time series in the variable $t$,
depending on a parameter of distribution $\gamma$. We have just shown
what amounts to the statement that all the moments and hence all
the statistical parameters of this distribution depend on the function

$$\Phi(\tau) = \int_{-\infty}^{\infty} K(s) K(s + \tau)\, ds = \int_{-\infty}^{\infty} K(s + t) K(s + t + \tau)\, ds \tag{3.35}$$

which is the statisticians' autocorrelation function with lag $\tau$. Thus
the statistics of distribution of $f(t, \gamma)$ are the same as the statistics of
$f(t + t_1, \gamma)$; and it can be shown in fact that if

$$f(t + t_1, \gamma) = f(t, \Gamma) \tag{3.36}$$

then the transformation of $\gamma$ into $\Gamma$ preserves measure. In other
words, our time series $f(t, \gamma)$ is in statistical equilibrium.
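
A discrete analogue of the autocorrelation function of Eq. 3.35 shows its main features: it depends on the kernel only through products at a fixed separation, and it is unchanged when the sign of the lag is reversed. This is a sketch under assumptions: the integral is replaced by a finite sum, and the sampled kernel `K` and the function name `autocorrelation` are hypothetical.

```python
def autocorrelation(kernel, lag):
    """Discrete analogue of Phi(tau) = integral of K(s) K(s + tau) ds."""
    lag = abs(lag)   # Phi is an even function of the lag
    return sum(kernel[i] * kernel[i + lag] for i in range(len(kernel) - lag))

K = [1.0, 0.5, 0.25]   # hypothetical sampled kernel, zero elsewhere
phi = [autocorrelation(K, lag) for lag in range(-3, 4)]
```

The list `phi` is symmetric about its middle entry, which is the sum of squares of the kernel, and it vanishes once the lag exceeds the length of the kernel.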

Moreover, if we consider the average of

$$\left[\int_{-\infty}^{\infty} K(t - \tau)\, d\xi(t, \gamma)\right]^m \left[\int_{-\infty}^{\infty} K(t + \sigma - \tau)\, d\xi(t, \gamma)\right]^n \tag{3.37}$$

it will consist of precisely the terms in

$$\int_0^1 d\gamma \left[\int_{-\infty}^{\infty} K(t - \tau)\, d\xi(t, \gamma)\right]^m \int_0^1 d\gamma \left[\int_{-\infty}^{\infty} K(t + \sigma - \tau)\, d\xi(t, \gamma)\right]^n \tag{3.38}$$

together with a finite number of terms involving as factors powers of

$$\int_{-\infty}^{\infty} K(\sigma + \tau) K(\tau)\, d\tau \tag{3.39}$$

and if this approaches 0 when $\sigma \to \infty$, Expression 3.38 will be the
limit of Expression 3.37 under these circumstances. In other words,
$f(t, \gamma)$ and $f(t + \sigma, \gamma)$ are asymptotically independent in their
distributions as $\sigma \to \infty$. By a more generally phrased but entirely
similar argument, it may be shown that the simultaneous
distribution of $f(t_1, \gamma), \dots, f(t_n, \gamma)$ and of $f(\sigma + s_1, \gamma), \dots, f(\sigma + s_m, \gamma)$
tends to the joint distribution of the first and the second set as
$\sigma \to \infty$. In other words, any bounded measurable functional or
quantity depending on the entire distribution of the values of the
function of $t$, $f(t, \gamma)$, which we may write in the form $\mathcal{F}[f(t, \gamma)]$,
must have the property that

$$\lim_{\sigma \to \infty} \int_0^1 \mathcal{F}[f(t, \gamma)]\, \mathcal{F}[f(t + \sigma, \gamma)]\, d\gamma = \left\{\int_0^1 \mathcal{F}[f(t, \gamma)]\, d\gamma\right\}^2 \tag{3.40}$$

If now $\mathcal{F}[f(t, \gamma)]$ is invariant under a translation of $t$, and only takes
on the values 0 or 1, we shall have

$$\int_0^1 \mathcal{F}[f(t, \gamma)]\, d\gamma = \left\{\int_0^1 \mathcal{F}[f(t, \gamma)]\, d\gamma\right\}^2 \tag{3.41}$$

so that the transformation group of $f(t, \gamma)$ into $f(t + \sigma, \gamma)$ is
metrically transitive. It follows that if $\mathcal{F}[f(t, \gamma)]$ is any integrable
functional of $f$ as a function of $t$, then by the ergodic theorem

$$\int_0^1 \mathcal{F}[f(t, \gamma)]\, d\gamma = \lim_{T \to \infty} \frac{1}{T} \int_0^T \mathcal{F}[f(t, \gamma)]\, dt = \lim_{T \to \infty} \frac{1}{T} \int_{-T}^0 \mathcal{F}[f(t, \gamma)]\, dt \tag{3.42}$$

for all values of $\gamma$ except for a set of zero measure. That is, we can
almost always read off any statistical parameter of such a time series,
and indeed any denumerable set of statistical parameters, from the
past history of a single example. Actually, for such a time series,
when we know

$$\lim_{T \to \infty} \frac{1}{T} \int_{-T}^0 f(t, \gamma)\, f(t - \tau, \gamma)\, dt \tag{3.43}$$

we know $\Phi(t)$ in almost every case, and we have a complete statistical
knowledge of the time series.

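There are certain quantities dependent on a time series of this
sort which have quite interesting properties. In particular, it is
interesting to know the average of

$$\exp\left[i \int_{-\infty}^{\infty} K(t)\, d\xi(t, \gamma)\right] \tag{3.44}$$

Formally, this may be written

$$\int_0^1 d\gamma \sum_{n=0}^{\infty} \frac{i^n}{n!} \left\{\int_{-\infty}^{\infty} K(t)\, d\xi(t, \gamma)\right\}^n = \sum_{m=0}^{\infty} \frac{(-1)^m}{(2m)!} \left\{\int_{-\infty}^{\infty} [K(t)]^2\, dt\right\}^m (2m - 1)(2m - 3) \cdots 5 \cdot 3 \cdot 1$$
$$= \sum_{m=0}^{\infty} \frac{(-1)^m}{2^m m!} \left\{\int_{-\infty}^{\infty} [K(t)]^2\, dt\right\}^m = \exp\left\{-\frac{1}{2} \int_{-\infty}^{\infty} [K(t)]^2\, dt\right\} \tag{3.45}$$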

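It is a very interesting problem to try to build up a time series as
general as possible from the simple Brownian motion series. In such
constructions, the example of the Fourier developments suggests
that such expansions as Expression 3.44 are convenient building
blocks for this purpose. In particular, let us investigate time series
of the special form

$$\int_a^b d\lambda \exp\left[i \int_{-\infty}^{\infty} K(t + \tau, \lambda)\, d\xi(\tau, \gamma)\right] \tag{3.46}$$

Let us suppose that we know $\xi(\tau, \gamma)$ as well as Expression 3.46.
Then, as in Eq. 3.45, if $t_1 > t_2$,

$$\int_0^1 d\gamma \exp\{i s [\xi(t_1, \gamma) - \xi(t_2, \gamma)]\} \times \int_a^b d\lambda \exp\left[i \int_{-\infty}^{\infty} K(t + \tau, \lambda)\, d\xi(\tau, \gamma)\right]$$
$$= \int_a^b d\lambda \exp\left\{-\frac{1}{2} \int_{-\infty}^{\infty} [K(t + \tau, \lambda)]^2\, dt - \frac{s^2}{2}(t_2 - t_1) - s \int_{t_1}^{t_2} K(t, \lambda)\, dt\right\} \tag{3.47}$$

If we now multiply by $\exp[s^2 (t_2 - t_1)/2]$, let $s(t_2 - t_1) = i\sigma$, and
let $t_2 \to t_1$, we obtain

$$\int_a^b d\lambda \exp\left\{-\frac{1}{2} \int_{-\infty}^{\infty} [K(t + \tau, \lambda)]^2\, dt - i \sigma K(t_1, \lambda)\right\} \tag{3.48}$$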

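Let us take $K(t_1, \lambda)$ as a new independent variable $\mu$ and solve for
$\lambda$, obtaining

$$\lambda = Q(t_1, \mu) \tag{3.49}$$

Then Expression 3.48 becomes

$$\int_{K(t_1, a)}^{K(t_1, b)} d\mu\; e^{-i\mu\sigma}\, \frac{\partial Q(t_1, \mu)}{\partial \mu} \exp\left(-\frac{1}{2} \int_{-\infty}^{\infty} \{K[t + \tau, Q(t_1, \mu)]\}^2\, dt\right) \tag{3.50}$$

From this, by a Fourier transformation, we can determine

$$\frac{\partial Q(t_1, \mu)}{\partial \mu} \exp\left(-\frac{1}{2} \int_{-\infty}^{\infty} \{K[t + \tau, Q(t_1, \mu)]\}^2\, dt\right) \tag{3.51}$$

as a function of $\mu$ when $\mu$ lies between $K(t_1, a)$ and $K(t_1, b)$. If we
integrate this function with respect to $\mu$, we determine

$$\int_a^{\lambda} d\lambda \exp\left\{-\frac{1}{2} \int_{-\infty}^{\infty} [K(t + \tau, \lambda)]^2\, dt\right\} \tag{3.52}$$

as a function of $K(t_1, \lambda)$ and $t_1$. That is, there is a known function
$F(u, v)$, such that

$$\int_a^{\lambda} d\lambda \exp\left\{-\frac{1}{2} \int_{-\infty}^{\infty} [K(t + \tau, \lambda)]^2\, dt\right\} = F[K(t_1, \lambda), t_1] \tag{3.53}$$

Since the left-hand side of this equation does not depend on $t_1$, we
may write it $G(\lambda)$, and put

$$F[K(t_1, \lambda), t_1] = G(\lambda) \tag{3.54}$$

Here, $F$ is a known function, and we can invert it with respect to
the first argument, and put

$$K(t_1, \lambda) = H[G(\lambda), t_1] \tag{3.55}$$

where $H$ is also a known function. Then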

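$$G(\lambda) = \int_a^{\lambda} d\lambda \exp\left(-\frac{1}{2} \int_{-\infty}^{\infty} \{H[G(\lambda), t + \tau]\}^2\, dt\right) \tag{3.56}$$

Then the function

$$\exp\left\{-\frac{1}{2} \int_{-\infty}^{\infty} [H(u, t)]^2\, dt\right\} = R(u) \tag{3.57}$$

will be a known function, and

$$\frac{dG}{d\lambda} = R(G) \tag{3.58}$$

That is,

$$\frac{dG}{R(G)} = d\lambda \tag{3.59}$$

or

$$\lambda = \int \frac{dG}{R(G)} + \text{const} = S(G) + \text{const} \tag{3.60}$$

This constant will be given by

$$G(a) = 0 \tag{3.61}$$

or

$$a = S(0) + \text{const} \tag{3.62}$$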
It is easy to see that if $a$ is finite, it does not matter what value we
give it; for our operator is not changed if we add a constant to all
values of $\lambda$. We can hence make it 0. We have thus determined
$\lambda$ as a function of $G$, and thus $G$ as a function of $\lambda$. Thus, by Eq.
3.55, we have determined $K(t, \lambda)$. To finish the determination of
Expression 3.46, we need only know $b$. This can be determined,
however, by a comparison of

$$\int_a^b d\lambda \exp\left\{-\frac{1}{2} \int_{-\infty}^{\infty} [K(t, \lambda)]^2\, dt\right\} \tag{3.63}$$

with

$$\int_0^1 d\gamma \int_a^b d\lambda \exp\left[i \int_{-\infty}^{\infty} K(t, \lambda)\, d\xi(t, \gamma)\right] \tag{3.64}$$

Thus, under certain circumstances which remain to be definitely
formulated, if a time series may be written in the form of Expression
3.46 and we know $\xi(t, \gamma)$ as well, we can determine the function
$K(t, \lambda)$ in Expression 3.46 and the numbers $a$ and $b$, except for an
undetermined constant added to $a$, $\lambda$, and $b$. There is no extra
difficulty if $b = +\infty$, and it is not hard to extend the reasoning to
the case where $a = -\infty$. Of course, a good deal of work remains to
be done to discuss the problem of the inversion of the functions
inverted when the results are not single-valued, and the general
conditions of validity of the expansions concerned. Still, we have
at least taken a first step toward the solution of the problem of
reducing a large class of time series to a canonical form, and this is
most important for the concrete formal application of the theories of
prediction and of the measurement of information, as we have
sketched them earlier in this chapter.

There is still one obvious limitation which we should remove
from this approach to the theory of time series: the necessity which
we are under of knowing $\xi(t, \gamma)$ as well as the time series which we are
expanding in the form of Expression 3.46. The question is: under
what circumstances can we represent a time series of known statistical
parameters as determined by a Brownian motion; or at least as the
limit in some sense or other of time series determined by Brownian
motions? We shall confine ourselves to time series with the property
of metrical transitivity, and with the even stronger property that if
we take intervals of fixed length but remote in time, the distributions
of any functionals of the segments of the time series in these intervals
approach independence as the intervals recede from each other.¹ The
theory to be developed here has already been sketched by the
author.

If $K(t)$ is a sufficiently continuous function, it is possible to show
that the zeros of

$$\int_{-\infty}^{\infty} K(t + \tau)\, d\xi(\tau, \gamma) \tag{3.65}$$

almost always have a definite density, by a theorem of M. Kac, and
that this density can be made as great as we wish by a proper
choice of $K$. Let $K_D$ be so selected that this density is $D$. Then
the sequence of zeros of $\int_{-\infty}^{\infty} K_D(t + \tau)\, d\xi(\tau, \gamma)$ from $-\infty$ to $\infty$ will
be called $Z_n(D, \gamma)$, $-\infty < n < \infty$. Of course, in the numeration of
these zeros, $n$ is determined except for an additive constant
integer.

Now, let $T(t, \mu)$ be any time series in the continuous variable $t$,
while $\mu$ is a parameter of distribution of the time series, varying
uniformly over (0, 1). Then let

$$T_D(t, \mu, \gamma) = T[t - Z_n(D, \gamma), \mu] \tag{3.66}$$

where the $Z_n$ taken is the one just preceding $t$. It will be seen that
for any finite set of values $t_1, t_2, \dots, t_\nu$ of $t$ the simultaneous
distribution of $T_D(t_\kappa, \mu, \gamma)$ $(\kappa = 1, 2, \dots, \nu)$ will approach the
simultaneous distribution of $T(t_\kappa, \mu)$ for the same $t_\kappa$'s as $D \to \infty$, for almost
every value of $\mu$. However, $T_D(t, \mu, \gamma)$ is completely determined by
$t$, $\mu$, $D$, and $\xi(\tau, \gamma)$. It is therefore not inappropriate to try to
express $T_D(t, \mu, \gamma)$, for a given $D$ and a given $\mu$, either directly in the
form of Expression 3.46 or in some way or another as a time series
which has a distribution which is a limit (in the loose sense just given)
of distributions of this form.

It must be admitted that this is a program to be carried through
in the future, rather than one which we can consider as already
accomplished. Nevertheless, it is the program which, in the
opinion of the author, offers the best hope for a rational, consistent
treatment of the many problems associated with non-linear prediction,
non-linear filtering, the evaluation of the transmission of information
in non-linear situations, and the theory of the dense gas and
turbulence. Among these problems are perhaps the most pressing
facing communication engineering.

¹ This is the mixing property of Koopman, which is the necessary and sufficient
ergodic assumption to justify statistical mechanics.

Let us now come to the prediction problem for time series of the
form of Eq. 3.34. We see that the only independent statistical
parameter of the time series is $\Phi(t)$, as given by Eq. 3.35; which
means that the only significant quantity connected with $K(t)$ is

$$\int_{-\infty}^{\infty} K(s) K(s + t)\, ds \tag{3.67}$$

Here of course $K$ is real.
Let us put

$$K(s) = \int_{-\infty}^{\infty} k(\omega)\, e^{i\omega s}\, d\omega \tag{3.68}$$

employing a Fourier transformation. To know $K(s)$ is to know
$k(\omega)$, and vice versa. Then

$$\frac{1}{2\pi} \int_{-\infty}^{\infty} K(s) K(s + \tau)\, ds = \int_{-\infty}^{\infty} k(\omega)\, k(-\omega)\, e^{i\omega\tau}\, d\omega \tag{3.69}$$

Thus a knowledge of $\Phi(\tau)$ is tantamount to a knowledge of
$k(\omega) k(-\omega)$. Since, however, $K(s)$ is real,

$$K(s) = \int_{-\infty}^{\infty} \overline{k(\omega)}\, e^{-i\omega s}\, d\omega \tag{3.70}$$

whence $k(\omega) = \overline{k(-\omega)}$. Thus $|k(\omega)|^2$ is a known function, which
means that the real part of $\log |k(\omega)|$ is a known function.

If we write

$$F(\omega) = \Re\{\log [k(\omega)]\} \tag{3.71}$$

then the determination of $K(s)$ is equivalent to the determination of
the imaginary part of $\log k(\omega)$. This problem is not determinate
unless we put some further restriction on $k(\omega)$. The type of
restriction which we shall put is that $\log k(\omega)$ shall be analytic and of a
sufficiently small rate of growth for $\omega$ in the upper half-plane. In
order to make this restriction, $k(\omega)$ and $[k(\omega)]^{-1}$ will be assumed to
be of algebraic growth on the real axis. Then $[F(\omega)]^2$ will be even
and at most logarithmically infinite, and the Cauchy principal value
of

$$G(\omega) = \frac{1}{\pi} \int_{-\infty}^{\infty} \frac{F(u)}{u - \omega}\, du \tag{3.72}$$

will exist. The transformation indicated by Eq. 3.72, known as the
Hilbert transformation, changes $\cos \lambda\omega$ into $\sin \lambda\omega$ and $\sin \lambda\omega$ into
$-\cos \lambda\omega$. Thus $F(\omega) + iG(\omega)$ is a function of the form

$$\int_0^{\infty} e^{i\lambda\omega}\, d[M(\lambda)] \tag{3.73}$$

and satisfies the required conditions for $\log |k(\omega)|$ in the lower
half-plane. If we now put

$$k(\omega) = \exp [F(\omega) + iG(\omega)] \tag{3.74}$$

it can be shown that $k(\omega)$ is a function which, under very general
conditions, is such that $K(s)$, as defined in Eq. 3.68, vanishes for all
negative arguments. Thus

$$f(t, \gamma) = \int_{-t}^{\infty} K(t + \tau)\, d\xi(\tau, \gamma) \tag{3.75}$$

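On the other hand, it can be shown that we may write $1/k(\omega)$ in the
form

$$\lim_{n \to \infty} \int_0^{\infty} e^{i\lambda\omega}\, dN_n(\lambda) \tag{3.76}$$

where the $N_n$'s are properly determined; and that this can be done
in such a way that

$$\xi(\tau, \gamma) = \lim_{n \to \infty} \int_0^{\tau} dt \int_{-t}^{\infty} Q_n(t + \sigma)\, f(\sigma, \gamma)\, d\sigma \tag{3.77}$$

Here the $Q_n$'s must have the formal property that

$$f(t, \gamma) = \lim_{n \to \infty} \int_{-t}^{\infty} K(t + \tau)\, d\tau \int_{-\tau}^{\infty} Q_n(\tau + \sigma)\, f(\sigma, \gamma)\, d\sigma \tag{3.78}$$

In general, we shall have

$$\psi(t) = \lim_{n \to \infty} \int_{-t}^{\infty} K(t + \tau)\, d\tau \int_{-\tau}^{\infty} Q_n(\tau + \sigma)\, \psi(\sigma)\, d\sigma \tag{3.79}$$

or if we write (as in Eq. 3.68)

$$K(s) = \int_{-\infty}^{\infty} k(\omega)\, e^{i\omega s}\, d\omega, \qquad Q_n(s) = \int_{-\infty}^{\infty} q_n(\omega)\, e^{i\omega s}\, d\omega, \qquad \psi(s) = \int_{-\infty}^{\infty} \Psi(\omega)\, e^{i\omega s}\, d\omega \tag{3.80}$$

then

$$\Psi(\omega) = \lim_{n \to \infty} (2\pi)^{3/2}\, \Psi(\omega)\, q_n(-\omega)\, k(\omega) \tag{3.81}$$

Thus

$$\lim_{n \to \infty} q_n(-\omega) = \frac{1}{(2\pi)^{3/2}\, k(\omega)} \tag{3.82}$$

We shall find this result useful in getting the operator of prediction
into a form concerning frequency rather than time.

Thus the past and present of $\xi(t, \gamma)$, or properly of the
“differential” $d\xi(t, \gamma)$, determine the past and present of $f(t, \gamma)$, and vice
versa.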

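Now, if $A > 0$,

$$f(t + A, \gamma) = \int_{-t-A}^{\infty} K(t + A + \tau)\, d\xi(\tau, \gamma)$$
$$= \int_{-t-A}^{-t} K(t + A + \tau)\, d\xi(\tau, \gamma) + \int_{-t}^{\infty} K(t + A + \tau)\, d\xi(\tau, \gamma) \tag{3.83}$$

Here the first term of the last expression depends on a range of
$d\xi(\tau, \gamma)$ of which a knowledge of $f(\sigma, \gamma)$ for $\sigma \le t$ tells us nothing, and
is entirely independent of the second term. Its mean square value is

$$\int_{-t-A}^{-t} [K(t + A + \tau)]^2\, d\tau = \int_0^A [K(\tau)]^2\, d\tau \tag{3.84}$$

and this tells us all there is to know about it statistically. It may
be shown to have a Gaussian distribution with this mean square
value. It is the error of the best possible prediction of $f(t + A, \gamma)$.
The best possible prediction itself is the last term of Eq. 3.83,

$$\int_{-t}^{\infty} K(t + A + \tau)\, d\xi(\tau, \gamma) = \lim_{n \to \infty} \int_{-t}^{\infty} K(t + A + \tau)\, d\tau \int_{-\tau}^{\infty} Q_n(\tau + \sigma)\, f(\sigma, \gamma)\, d\sigma \tag{3.85}$$

If we now put

$$k_A(\omega) = \frac{1}{2\pi} \int_0^{\infty} K(t + A)\, e^{-i\omega t}\, dt \tag{3.86}$$

and if we apply the operator of Eq. 3.85 to $e^{i\omega t}$, obtaining

$$\lim_{n \to \infty} \int_{-t}^{\infty} K(t + A + \tau)\, d\tau \int_{-\tau}^{\infty} Q_n(\tau + \sigma)\, e^{i\omega\sigma}\, d\sigma = A(\omega)\, e^{i\omega t} \tag{3.87}$$

we shall find out (somewhat as in Eq. 3.81) that

$$A(\omega) = \lim_{n \to \infty} (2\pi)^{3/2}\, q_n(-\omega)\, k_A(\omega) = \frac{k_A(\omega)}{k(\omega)} = \frac{1}{2\pi k(\omega)} \int_A^{\infty} e^{-i\omega(t - A)}\, dt \int_{-\infty}^{\infty} k(u)\, e^{iut}\, du \tag{3.88}$$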

This is then the frequency form of the best prediction operator. 
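
A discrete-time analogue makes the split of Eq. 3.83 concrete. The sketch below is an illustration under assumptions, not the construction of the text: the series is taken as $f_t = \sum_j K_j e_{t-j}$ with independent unit-variance Gaussian shocks $e_t$ standing in for $d\xi$, so that the shocks are assumed known rather than recovered from the series, and the names `series_value`, `predict`, and `theoretical_mse` are hypothetical. The prediction keeps the terms driven by shocks already seen, and the mean square error is the sum of the first few squared kernel values, the analogue of Eq. 3.84.

```python
import random

def series_value(kernel, shocks, t):
    """f[t] = sum over j of kernel[j] * shocks[t - j]."""
    return sum(kernel[j] * shocks[t - j] for j in range(len(kernel)))

def predict(kernel, shocks, t, lead):
    """Part of f[t + lead] that depends only on shocks up to time t."""
    return sum(kernel[j] * shocks[t + lead - j] for j in range(lead, len(kernel)))

def theoretical_mse(kernel, lead):
    """Analogue of Eq. 3.84: sum of kernel[j]**2 for j < lead."""
    return sum(k * k for k in kernel[:lead])

rng = random.Random(0)
kernel = [1.0, 0.6, 0.36, 0.216]   # hypothetical one-sided kernel
lead = 2
shocks = [rng.gauss(0.0, 1.0) for _ in range(20_000)]

errors = [series_value(kernel, shocks, t + lead) - predict(kernel, shocks, t, lead)
          for t in range(len(kernel), len(shocks) - lead)]
empirical_mse = sum(err * err for err in errors) / len(errors)
```

The value of `empirical_mse` should agree with `theoretical_mse(kernel, lead)` up to sampling error, and the error grows as the lead is lengthened.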

The problem of filtering in the case of time series such as Eq. 3.34
is very closely allied to the prediction problem. Let our message
plus noise be of the form

$$m(t) + n(t) = \int_0^{\infty} K(\tau)\, d\xi(t - \tau, \gamma) \tag{3.89}$$

and let the message be of the form

$$m(t) = \int_{-\infty}^{\infty} Q(\tau)\, d\xi(t - \tau, \gamma) + \int_{-\infty}^{\infty} R(\tau)\, d\xi(t - \tau, \delta) \tag{3.90}$$

where $\gamma$ and $\delta$ are distributed independently over (0, 1). Then the
predictable part of the $m(t + a)$ is clearly

$$\int_0^{\infty} Q(\tau + a)\, d\xi(t - \tau, \gamma) \tag{3.901}$$

and the mean square error of prediction is

$$\int_{-\infty}^{a} [Q(\tau)]^2\, d\tau + \int_{-\infty}^{\infty} [R(\tau)]^2\, d\tau \tag{3.902}$$

Furthermore, let us suppose that we know the following quantities:


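$$\phi_{22}(t) = \int_0^1 d\gamma \int_0^1 d\delta\; n(t + \tau)\, n(\tau)$$
$$= \int_{-\infty}^{\infty} [K(|t| + \tau) - Q(|t| + \tau)][K(\tau) - Q(\tau)]\, d\tau + \int_{-\infty}^{\infty} R(|t| + \tau) R(\tau)\, d\tau$$
$$= \int_0^{\infty} [K(|t| + \tau) - Q(|t| + \tau)][K(\tau) - Q(\tau)]\, d\tau + \int_{-|t|}^{0} [K(|t| + \tau) - Q(|t| + \tau)][-Q(\tau)]\, d\tau$$
$$\quad + \int_{-\infty}^{-|t|} Q(|t| + \tau) Q(\tau)\, d\tau + \int_{-\infty}^{\infty} R(|t| + \tau) R(\tau)\, d\tau$$
$$= \int_0^{\infty} K(|t| + \tau) K(\tau)\, d\tau - \int_{-|t|}^{\infty} K(|t| + \tau) Q(\tau)\, d\tau - \int_0^{\infty} Q(|t| + \tau) K(\tau)\, d\tau$$
$$\quad + \int_{-\infty}^{\infty} Q(|t| + \tau) Q(\tau)\, d\tau + \int_{-\infty}^{\infty} R(|t| + \tau) R(\tau)\, d\tau \tag{3.903}$$

$$\phi_{11}(t) = \int_0^1 d\gamma \int_0^1 d\delta\; m(|t| + \tau)\, m(\tau) = \int_{-\infty}^{\infty} Q(|t| + \tau) Q(\tau)\, d\tau + \int_{-\infty}^{\infty} R(|t| + \tau) R(\tau)\, d\tau \tag{3.904}$$

$$\phi_{12}(t) = \int_0^1 d\gamma \int_0^1 d\delta\; m(t + \tau)\, n(\tau) = \int_0^1 d\gamma \int_0^1 d\delta\; m(t + \tau)[m(\tau) + n(\tau)] - \phi_{11}(t)$$
$$= \int_{-t}^{\infty} K(t + \tau) Q(\tau)\, d\tau - \phi_{11}(t) \tag{3.905}$$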

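The Fourier transforms of these three quantities are, respectively,

$$\Phi_{22}(\omega) = |k(\omega)|^2 + |q(\omega)|^2 - q(\omega)\overline{k(\omega)} - \overline{q(\omega)}k(\omega) + |r(\omega)|^2$$
$$\Phi_{11}(\omega) = |q(\omega)|^2 + |r(\omega)|^2$$
$$\Phi_{12}(\omega) = k(\omega)\overline{q(\omega)} - |q(\omega)|^2 - |r(\omega)|^2 \tag{3.906}$$

where

$$k(\omega) = \frac{1}{2\pi} \int_{-\infty}^{\infty} K(s)\, e^{-i\omega s}\, ds, \qquad q(\omega) = \frac{1}{2\pi} \int_{-\infty}^{\infty} Q(s)\, e^{-i\omega s}\, ds, \qquad r(\omega) = \frac{1}{2\pi} \int_{-\infty}^{\infty} R(s)\, e^{-i\omega s}\, ds \tag{3.907}$$

That is,

$$\Phi_{11}(\omega) + \Phi_{12}(\omega) + \Phi_{21}(\omega) + \Phi_{22}(\omega) = |k(\omega)|^2 \tag{3.908}$$

and

$$q(\omega)\overline{k(\omega)} = \Phi_{11}(\omega) + \Phi_{21}(\omega) \tag{3.909}$$

where for symmetry we write $\Phi_{21}(\omega) = \overline{\Phi_{12}(\omega)}$. We can now
determine $k(\omega)$ from Eq. 3.908, as we have defined $k(\omega)$ before on the
basis of Eq. 3.74. Here we put $\Phi(t)$ for $\phi_{11}(t) + \phi_{22}(t) + 2\Re[\phi_{12}(t)]$.
This will give us

$$q(\omega) = \frac{\Phi_{11}(\omega) + \Phi_{21}(\omega)}{\overline{k(\omega)}} \tag{3.910}$$

Hence

$$Q(t) = \int_{-\infty}^{\infty} \frac{\Phi_{11}(\omega) + \Phi_{21}(\omega)}{\overline{k(\omega)}}\, e^{i\omega t}\, d\omega \tag{3.911}$$

and thus the best determination of $m(t)$, with the least mean square
error, is

$$\int_0^{\infty} d\xi(t - \tau, \gamma) \int_{-\infty}^{\infty} \frac{\Phi_{11}(\omega) + \Phi_{21}(\omega)}{\overline{k(\omega)}}\, e^{i\omega(\tau + a)}\, d\omega \tag{3.912}$$

Combining this with Eq. 3.89, and using an argument similar to the
one by which we obtained Eq. 3.88, we see that the operator on
$m(t) + n(t)$ by which we obtain the “best” representation of
$m(t + a)$, if we write it on the frequency scale, is

$$\frac{1}{2\pi k(\omega)} \int_a^{\infty} e^{-i\omega(t - a)}\, dt \int_{-\infty}^{\infty} \frac{\Phi_{11}(u) + \Phi_{21}(u)}{\overline{k(u)}}\, e^{iut}\, du \tag{3.913}$$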

This operator constitutes a characteristic operator of what electrical
engineers know as a wave filter. The quantity a is the lag of
the filter. It may be either positive or negative; when it is negative,
−a is known as the lead. The apparatus corresponding to Expression
3.913 may always be constructed with as much accuracy as we
like. The details of its construction are more for the specialist in
electrical engineering than for the reader of this book. They may be 
found elsewhere. 1 

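The mean square filtering error (Expression 3.902) may be
represented as the sum of the mean square filtering error for infinite
lag:

\[
\begin{aligned}
\int_{-\infty}^{\infty} [R(\tau)]^2\, d\tau
&= \phi_{11}(0) - \int_{-\infty}^{\infty} [Q(\tau)]^2\, d\tau \\
&= \frac{1}{2\pi} \int_{-\infty}^{\infty} \Phi_{11}(\omega)\, d\omega
 - \frac{1}{2\pi} \int_{-\infty}^{\infty}
 \left| \frac{\Phi_{11}(\omega) + \Phi_{21}(\omega)}{k(\omega)} \right|^2 d\omega \\
&= \frac{1}{2\pi} \int_{-\infty}^{\infty} \left[ \Phi_{11}(\omega)
 - \frac{|\Phi_{11}(\omega) + \Phi_{21}(\omega)|^2}
 {\Phi_{11}(\omega) + \Phi_{12}(\omega) + \Phi_{21}(\omega) + \Phi_{22}(\omega)} \right] d\omega \\
&= \frac{1}{2\pi} \int_{-\infty}^{\infty}
 \frac{\begin{vmatrix} \Phi_{11}(\omega) & \Phi_{12}(\omega) \\ \Phi_{21}(\omega) & \Phi_{22}(\omega) \end{vmatrix}}
 {\Phi_{11}(\omega) + \Phi_{12}(\omega) + \Phi_{21}(\omega) + \Phi_{22}(\omega)}\, d\omega
\end{aligned}
\tag{3.914}
\]

and a part dependent on the lag:

\[
\int_{-\infty}^{a} [Q(\tau)]^2\, d\tau = \int_{-\infty}^{a} dt \left| \int_{-\infty}^{\infty}
 \frac{\Phi_{11}(\omega) + \Phi_{21}(\omega)}{\overline{k(\omega)}}\, e^{i\omega t}\, d\omega \right|^2 \tag{3.915}
\]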

It will be seen that the mean square error of filtering is a monotonely 
decreasing function of lag. 
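
The last equality in Eq. 3.914 rests on a short piece of algebra which is left implicit above. A sketch of it follows, assuming that Φ_11 and Φ_22 are real and that Φ_12 and Φ_21 are complex conjugates of one another; the letter S for the common denominator is a label introduced here for convenience, and the argument ω is suppressed.

```latex
\begin{aligned}
S &= \Phi_{11} + \Phi_{12} + \Phi_{21} + \Phi_{22}, \\
|\Phi_{11} + \Phi_{21}|^2 &= (\Phi_{11} + \Phi_{21})(\Phi_{11} + \Phi_{12}), \\
\Phi_{11} S - |\Phi_{11} + \Phi_{21}|^2
 &= \Phi_{11}\Phi_{22} - \Phi_{12}\Phi_{21}
 = \begin{vmatrix} \Phi_{11} & \Phi_{12} \\ \Phi_{21} & \Phi_{22} \end{vmatrix}.
\end{aligned}
```

Dividing the last line by S gives the integrand of the determinant form.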

1 We refer especially to recent papers by Dr. Y. W. Lee.

Another question which is interesting in the case of messages and
noises derived from the Brownian motion is the matter of rate of
transmission of information. Let us consider for simplicity the
case where the message and the noise are incoherent, that is, when

\[
\Phi_{12}(\omega) = \Phi_{21}(\omega) = 0 \tag{3.916}
\]

In this case, let us consider

\[
m(t) = \int_{-\infty}^{\infty} M(\tau)\, d\xi(t - \tau, \gamma), \qquad
n(t) = \int_{-\infty}^{\infty} N(\tau)\, d\xi(t - \tau, \delta) \tag{3.917}
\]

where γ and δ are distributed independently. Let us suppose we
know m(t) + n(t) over (−A, A); how much information do we have
concerning m(t)? Note that we should heuristically expect that it
would not be very different from the amount of information concerning

\[
\int_{-A}^{A} M(\tau)\, d\xi(t - \tau, \gamma) \tag{3.918}
\]

which we have when we know all values of

\[
\int_{-A}^{A} M(\tau)\, d\xi(t - \tau, \gamma) + \int_{-A}^{A} N(\tau)\, d\xi(t - \tau, \delta) \tag{3.919}
\]

where γ and δ have independent distributions. It can, however,
be shown that the nth Fourier coefficient of Expression 3.918 has a
Gaussian distribution independent of all the other Fourier coefficients,
and that its mean square value is proportional to

\[
\left| \int_{-A}^{A} M(\tau) \exp\left( i\, \frac{\pi n \tau}{A} \right) d\tau \right|^2 \tag{3.920}
\]

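Thus, by Eq. 3.09, the total amount of information available
concerning M is

\[
\sum_{n=-\infty}^{\infty} \log_2
 \frac{\left| \int_{-A}^{A} M(\tau) \exp\left( i\, \frac{\pi n \tau}{A} \right) d\tau \right|^2
 + \left| \int_{-A}^{A} N(\tau) \exp\left( i\, \frac{\pi n \tau}{A} \right) d\tau \right|^2}
 {\left| \int_{-A}^{A} N(\tau) \exp\left( i\, \frac{\pi n \tau}{A} \right) d\tau \right|^2} \tag{3.921}
\]

and the time density of communication of energy is this quantity
divided by 2A. If now A → ∞, this density approaches

\[
\frac{1}{2\pi} \int_{-\infty}^{\infty} du\, \log_2
 \frac{\left| \int_{-\infty}^{\infty} M(\tau)\, e^{iu\tau}\, d\tau \right|^2
 + \left| \int_{-\infty}^{\infty} N(\tau)\, e^{iu\tau}\, d\tau \right|^2}
 {\left| \int_{-\infty}^{\infty} N(\tau)\, e^{iu\tau}\, d\tau \right|^2} \tag{3.922}
\]
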
This is precisely the result which the author and Shannon have
already obtained for the rate of transmission of information in this
case. As will be seen, it depends not only on the width of the
frequency band available for transmitting the message but also on
the noise level. As a matter of fact, it has a close relation to the
audiograms used to measure the amount of hearing and loss of
hearing in a given individual. Here the abscissa is frequency, the
ordinate of the lower boundary is the logarithm of the intensity of the
threshold of audible intensity (what we may call the logarithm of
the intensity of the internal noise of the receiving system), and the
upper boundary, the logarithm of the intensity of the greatest
message the system is suited to handle. The area between them,
which is a quantity of the dimension of Expression 3.922, is then taken
as a measure of the rate of transmission of information with which
the ear is competent to cope.

The theory of messages depending linearly on the Brownian
motion has many important variants. The key formulae are Eqs.
3.88 and 3.914 and Expression 3.922, together, of course, with the
definitions necessary to interpret these. First: the theory gives us
the best possible design of predictors and of wave filters in the case
in which the messages and the noises represent the response of linear
resonators to Brownian motions; but in much more general cases, they
represent a possible design for predictors and filters. This will not be
an absolute best possible design, but it will minimize the mean square
error of prediction and filtering, in so far as this can be done with
apparatus performing linear operations. However, there will
generally be some non-linear apparatus which gives a performance
still better than that of any linear apparatus.

Next, the time series here have been simple time series, in which a
single numerical variable depends on the time. There are also
multiple time series, in which a number of such variables depend
simultaneously on the time; and it is these which are of greatest
importance in economics, meteorology, and the like. The complete
weather map of the United States, taken from day to day, constitutes
such a time series. In this case, we have to develop a number of
functions simultaneously in terms of the frequency, and the quadratic
quantities such as Eq. 3.35 and the |k(ω)|² of the arguments
following Eq. 3.70 are replaced by arrays of pairs of quantities, that
is, matrices. The problem of determining k(ω) in terms of |k(ω)|², in
such a way as to satisfy certain auxiliary conditions in the complex
plane, becomes much more difficult, especially as the multiplication
of matrices is not a permutable operation. However, the problems
involved in this multidimensional theory have been solved, at least
in part, by Krein and the author.

The multidimensional theory represents a complication of the one
already given. There is another closely related theory which is a
simplification of it. This is the theory of prediction, filtering, and
amount of information in discrete time series. Such a series is a
sequence of functions f_n(α) of a parameter α, where n runs over all
integer values from −∞ to ∞. The quantity α is as before the
parameter of distribution, and may be taken to run uniformly over
(0, 1). The time series is said to be in statistical equilibrium when
the change of n to n + ν (ν an integer) is equivalent to a
measure-preserving transformation into itself of the interval (0, 1)
over which α runs.

The theory of discrete time series is simpler in many respects 
than the theory of the continuous series. It is much easier, for 
instance, to make them depend on a sequence of independent choices. 
Each term (in the mixing case) will be representable as a combination
of the previous terms with a quantity independent of all previous
terms, distributed uniformly over (0, 1), and the sequence of these 
independent factors may be taken to replace the Brownian motion 
which is so important in the continuous case. 
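
The construction just described, in which each term is built from a sequence of independent choices, can be tried numerically. The following is a minimal sketch and not the author's procedure: it assumes a finite moving average with hypothetical weights `weights`, draws each independent quantity uniformly over (0, 1) and recentres it to mean zero, and compares the autocorrelation found by averaging along a single series with the value obtained by averaging over the ensemble. All function names are introduced here for illustration only.

```python
import random

def make_series(weights, length, seed=0):
    """Build f_n = sum_j weights[j] * e_{n-j} from independent choices e_n.

    Each e_n is drawn uniformly over (0, 1) and recentred to mean zero
    (the recentring is an assumption made here for convenience).
    """
    rng = random.Random(seed)
    pad = len(weights) - 1
    e = [rng.random() - 0.5 for _ in range(length + pad)]
    return [
        sum(w * e[n + pad - j] for j, w in enumerate(weights))
        for n in range(length)
    ]

def time_average_autocorrelation(f, m):
    """Estimate phi_m by averaging f[k + m] * f[k] along one sample series."""
    terms = len(f) - m
    return sum(f[k + m] * f[k] for k in range(terms)) / terms

def ensemble_autocorrelation(weights, m, variance=1.0 / 12.0):
    """Exact phi_m for the moving average above: variance * sum_j w_j * w_{j+m}."""
    return variance * sum(
        weights[j] * weights[j + m] for j in range(len(weights) - m)
    )

weights = [1.0, 0.5, 0.25]  # hypothetical weights
series = make_series(weights, length=200_000, seed=1)
for m in range(3):
    print(m, time_average_autocorrelation(series, m),
          ensemble_autocorrelation(weights, m))
```

For a long series the two columns should agree closely, which is what statistical equilibrium together with metric transitivity leads one to expect.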

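If f_n(α) is a time series in statistical equilibrium, and it is metrically
transitive, its autocorrelation coefficient will be

\[
\phi_m = \int_0^1 f_m(\alpha)\, f_0(\alpha)\, d\alpha \tag{3.923}
\]

and we shall have

\[
\phi_m = \lim_{N \to \infty} \frac{1}{N + 1} \sum_{k=0}^{N} f_{k+m}(\alpha)\, f_k(\alpha)
 = \lim_{N \to \infty} \frac{1}{N + 1} \sum_{k=0}^{N} f_{-k+m}(\alpha)\, f_{-k}(\alpha) \tag{3.924}
\]

for almost all α. Let us put

\[
\phi_n = \frac{1}{2\pi} \int_{-\pi}^{\pi} \Phi(\omega)\, e^{in\omega}\, d\omega \tag{3.925}
\]

or

\[
\Phi(\omega) = \sum_{n=-\infty}^{\infty} \phi_n\, e^{-in\omega} \tag{3.926}
\]

Let

\[
\log \Phi(\omega) = \sum_{n=-\infty}^{\infty} p_n \cos n\omega \tag{3.927}
\]

and let

\[
G(\omega) = \frac{p_0}{2} + \sum_{n=1}^{\infty} p_n\, e^{in\omega} \tag{3.928}
\]

Let

\[
e^{G(\omega)} = k(\omega) \tag{3.929}
\]

Then under very general conditions, k(ω) will be the boundary
value on the unit circle of a function without zeros or singularities
inside the unit circle if ω is the angle. We shall have

\[
|k(\omega)|^2 = \Phi(\omega) \tag{3.930}
\]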

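If now we put for the best linear prediction of f_n(α) with a lead of ν

\[
\sum_{\mu=0}^{\infty} f_{n-\mu}(\alpha)\, W_\mu \tag{3.931}
\]

we shall find that

\[
\sum_{\mu=0}^{\infty} W_\mu\, e^{i\mu\omega} = \frac{1}{2\pi k(\omega)} \sum_{\mu=\nu}^{\infty}
 e^{i\omega(\mu - \nu)} \int_{-\pi}^{\pi} k(u)\, e^{-i\mu u}\, du \tag{3.932}
\]

This is the analogue of Eq. 3.88. Let us note that if we put

\[
k_\mu = \frac{1}{2\pi} \int_{-\pi}^{\pi} k(u)\, e^{-i\mu u}\, du \tag{3.933}
\]

then

\[
k(\omega) = \sum_{\mu=0}^{\infty} k_\mu\, e^{i\mu\omega} \tag{3.934}
\]
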

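It will clearly be the result of the way we have formed k(ω) that in
a very general set of cases we can put

\[
\frac{1}{k(\omega)} = \sum_{\mu=0}^{\infty} q_\mu\, e^{i\mu\omega} \tag{3.935}
\]

Then Eq. 3.932 becomes

\[
\sum_{\mu=0}^{\infty} W_\mu\, e^{i\mu\omega} = e^{-i\nu\omega} \left( 1 - \sum_{\mu=0}^{\nu-1} k_\mu\, e^{i\mu\omega}
 \sum_{\lambda=0}^{\infty} q_\lambda\, e^{i\lambda\omega} \right) \tag{3.936}
\]

In particular, if ν = 1,

\[
\sum_{\mu=0}^{\infty} W_\mu\, e^{i\mu\omega} = e^{-i\omega} \left( 1 - k_0 \sum_{\lambda=0}^{\infty} q_\lambda\, e^{i\lambda\omega} \right) \tag{3.937}
\]

or

\[
W_\mu = -q_{\mu+1}\, k_0 \tag{3.938}
\]

Thus for a prediction one step ahead, the best value for f_{n+1}(α) is

\[
-k_0 \sum_{\mu=0}^{\infty} q_{\mu+1}\, f_{n-\mu}(\alpha) \tag{3.939}
\]

and by a process of step-by-step prediction, we can solve the entire
problem of linear prediction for discrete time series. As in the
continuous case, this will be the best prediction possible by any
method if

\[
f_n(\alpha) = \int_{-\infty}^{\infty} K(n - \tau)\, d\xi(\tau, \alpha) \tag{3.940}
\]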
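
The one-step predictor of Expression 3.939 can be tried on a simple case. The sketch below assumes that the coefficients k_μ of k(ω) are already known (in practice they would have to be found from the factorization of Φ(ω)), computes the coefficients q_μ of 1/k(ω) by matching powers of e^{iω}, and forms the weights W_μ = −q_{μ+1} k_0 of Eq. 3.938. The choice k_μ = ρ^μ and all function names are hypothetical and serve only as an illustration.

```python
def inverse_series(k, terms):
    """Coefficients q of 1/k(w), where k(w) = sum_mu k[mu] * e^{i mu w}.

    Found by matching powers in k * q = 1; requires k[0] != 0.
    """
    q = [1.0 / k[0]]
    for n in range(1, terms):
        acc = sum(k[j] * q[n - j] for j in range(1, min(n, len(k) - 1) + 1))
        q.append(-acc / k[0])
    return q

def one_step_weights(k, terms):
    """Weights W_mu = -k_0 * q_{mu+1} of the one-step-ahead predictor."""
    q = inverse_series(k, terms + 1)
    return [-k[0] * q[mu + 1] for mu in range(terms)]

def predict_next(past, weights):
    """Apply sum_mu W_mu * f_{n-mu}; past[0] is f_n, past[1] is f_{n-1}, ..."""
    return sum(w * f for w, f in zip(weights, past))

# Hypothetical example: k_mu = rho**mu, which corresponds to a series obeying
# f_n = rho * f_{n-1} + (a new term independent of the past).
rho = 0.6
k = [rho ** mu for mu in range(40)]
W = one_step_weights(k, terms=5)
print(W)
print(predict_next([2.0, 1.0, 0.5], W))
```

For this choice of k the only weight that does not vanish (up to rounding) is W_0 = ρ, so that the best linear prediction of the next term is simply ρ times the present one.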

The transfer of the filtering problem from the continuous to the
discrete case follows much the same lines of argument. Formula
3.913 for the frequency characteristic of the best filter takes the form

\[
\frac{1}{2\pi k(\omega)} \sum_{\nu=a}^{\infty} e^{i\omega(\nu - a)} \int_{-\pi}^{\pi}
 \frac{\Phi_{11}(u) + \Phi_{21}(u)}{\overline{k(u)}}\, e^{-iu\nu}\, du \tag{3.941}
\]

where all the terms receive the same definitions as in the continuous
case, except that all integrals on ω or u are from −π to π
instead of from −∞ to ∞ and all sums on ν are discrete sums
instead of integrals on t. The filters for discrete time series are
usually not so much physically constructible devices to be used with
an electric circuit as mathematical procedures to enable statisticians
to obtain the best results with statistically impure data.

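Finally, the rate of transfer of information by a discrete time
series of the form

\[
\int_{-\infty}^{\infty} M(n - \tau)\, d\xi(\tau, \gamma) \tag{3.942}
\]

in the presence of a noise

\[
\int_{-\infty}^{\infty} N(n - \tau)\, d\xi(\tau, \delta) \tag{3.943}
\]

when γ and δ are independent, will be the precise analogue of
Expression 3.922, namely,

\[
\frac{1}{2\pi} \int_{-\pi}^{\pi} du\, \log_2
 \frac{\left| \int_{-\infty}^{\infty} M(\tau)\, e^{iu\tau}\, d\tau \right|^2
 + \left| \int_{-\infty}^{\infty} N(\tau)\, e^{iu\tau}\, d\tau \right|^2}
 {\left| \int_{-\infty}^{\infty} N(\tau)\, e^{iu\tau}\, d\tau \right|^2} \tag{3.944}
\]

where over (−π, π),

\[
\left| \int_{-\infty}^{\infty} M(\tau)\, e^{iu\tau}\, d\tau \right|^2 \tag{3.945}
\]

represents the power distribution of the message in frequency, and

\[
\left| \int_{-\infty}^{\infty} N(\tau)\, e^{iu\tau}\, d\tau \right|^2 \tag{3.946}
\]

that of the noise.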

The statistical theories we have here developed involve a full
knowledge of the pasts of the time series we observe. In every case,
we have to be content with less, as our observation does not run
indefinitely into the past. The development of our theory beyond
this point, as a practical statistical theory, involves an extension of
existing methods of sampling. The author and others have made a
beginning in this direction. It involves all the complexities of the
use either of Bayes’ law, on the one hand, or of those terminological
tricks in the theory of likelihood, 1 on the other, which seem to avoid
the necessity for the use of Bayes’ law but which in reality transfer
the responsibility for its use to the working statistician, or the person
who ultimately employs his results. Meanwhile, the statistical
theorist is quite honestly able to say that he has said nothing which is
not perfectly rigorous and unimpeachable.

1 See writings of R. A. Fisher and J. von Neumann.

Finally, this chapter should end with a discussion of modern
quantum mechanics. This represents the highest point of the
invasion of modern physics by the theory of time series. In the
Newtonian physics, the sequence of physical phenomena is completely
determined by its past and in particular by the determination
of all positions and momenta at any one moment. In the
complete Gibbsian theory, it is still true that with a perfect determination
of the multiple time series of the whole universe the knowledge
of all positions and momenta at any one moment would determine the
entire future. It is only because there are ignored, non-observed
coordinates and momenta that the time series with which we actually
work take on the sort of mixing property with which we have become
familiar in this chapter, in the case of time series derived from the
Brownian motion. The great contribution of Heisenberg to physics
was the replacement of this still quasi-Newtonian world of Gibbs by
one in which the time series can in no way be reduced to an assembly
of determinate threads of development in time. In quantum
mechanics, the whole past of an individual system does not determine
the future of that system in any absolute way but merely the
distribution of possible futures of the system. The quantities which the
classical physics demands for a knowledge of the entire course of a
system are not simultaneously observable, except in a loose and
approximate way, which nevertheless is sufficiently precise for the
needs of the classical physics over the range of precision where it has
been shown experimentally to be applicable. The conditions of the
observation of a momentum and its corresponding position are
incompatible. To observe the position of a system as precisely as
possible, we must observe it with light or electron waves or similar
means of high resolving power, or short wavelength. However, the
light has a particle action depending on its frequency only, and to
illuminate a body with high-frequency light means to subject it to a
change in its momentum which increases with the frequency. On
the other hand, it is low-frequency light that gives the minimum
change in the momenta of the particles it illuminates, and this has not
a sufficient resolving power to give a sharp indication of positions.
Intermediate frequencies of light give a blurred account both of
positions and of momenta. In general, there is no set of observations
conceivable which can give us enough information about the past of
a system to give us complete information as to its future.

Nevertheless, as in the case of all ensembles of time series, the
theory of the amount of information which we have here developed
is applicable, and consequently the theory of entropy. Since,
however, we now are dealing with time series with the mixing property,
even when our data are as complete as they can be, we find that
our system has no absolute potential barriers, and that in the course
of time any state of the system can and will transform itself into any
other state. However, the probability of this depends in the long
run on the relative probability or measure of the two states. This
turns out to be especially high for states which can be transformed
into themselves by a large number of transformations, for states
which, in the language of the quantum theorist, have a high internal
resonance, or a high quantum degeneracy. The benzene ring is an
example of this, since the two states are equivalent. This suggests
that in a system in which various building blocks may combine themselves
intimately in various ways, as when a mixture of amino acids
organizes itself into protein chains, a situation where many of these
chains are alike and go through a stage of close association with one
another may be more stable than one in which they are different.
It was suggested by Haldane, in a tentative manner, that this may
be the way in which genes and viruses reproduce themselves; and
although he has not asserted this suggestion of his with anything
like finality, I see no cause not to retain it as a tentative hypothesis.
As Haldane himself has pointed out, as no single particle in quantum
theory has a perfectly sharp individuality, it is not possible in such a
case to say, with more than fragmentary accuracy, which of the two
examples of a gene that has reproduced itself in this manner is the
master pattern and which is the copy.

This same phenomenon of resonance is known to be very frequently
represented in living matter. Szent-Györgyi has suggested its
importance in the construction of muscles. Substances with high
resonance very generally have an abnormal capacity for storing both
energy and information, and such a storage certainly occurs in
muscle contraction.

Again, the same phenomena that are concerned in reproduction
probably have something to do with the extraordinary specificity of
the chemical substances found in a living organism, not only from
species to species but even within the individuals of a species. Such
considerations may be very important in immunology.



IV 


Feedback and Oscillation 


A patient comes into a neurological clinic. He is not paralyzed, 
and he can move his legs when he receives the order. Nevertheless, 
he suffers under a severe disability. He walks with a peculiar 
uncertain gait, with eyes downcast on the ground and on his legs. 
He starts each step with a kick, throwing each leg in succession in 
front of him. If blindfolded, he cannot stand up, and totters to the 
ground. What is the matter with him? 

Another patient comes in. While he sits at rest in his chair, there 
seems to be nothing wrong with him. However, offer him a cigarette, 
and he will swing his hand past it in trying to pick it up. This will
be followed by an equally futile swing in the other direction, and this
by still a third swing back, until his motion becomes nothing but a
futile and violent oscillation. Give him a glass of water, and he will 
empty it in these swings before he is able to bring it to his mouth. 
What is the matter with him? 

Both of these patients are suffering from one form or another of
what is known as ataxia. Their muscles are strong and healthy
enough, but they are unable to organize their actions. The first
patient suffers from tabes dorsalis. The part of the spinal cord
which ordinarily receives sensations has been damaged or destroyed
by the late sequelae of syphilis. The incoming messages are blunted,
if they have not totally disappeared. The receptors in the joints
and tendons and muscles and the soles of his feet, which ordinarily
convey to him the position and state of motion of his legs, send no
messages which his central nervous system can pick up and transmit,
and for information concerning his posture he is obliged to trust to his
eyes and the balancing organs of his inner ear. In the jargon of the
physiologist, he has lost an important part of his proprioceptive or
kinesthetic sense.

The second patient has lost none of his proprioceptive sense. His
injury is elsewhere, in the cerebellum, and he is suffering from what is
known as a cerebellar tremor or purpose tremor. It seems likely
that the cerebellum has some function of proportioning the muscular
response to the proprioceptive input, and if this proportioning is
disturbed, a tremor may be one of the results.

We thus see that for effective action on the outer world it is not
only essential that we possess good effectors, but that the performance
of these effectors be properly monitored back to the central nervous
system, and that the readings of these monitors be properly combined
with the other information coming in from the sense organs to
produce a properly proportioned output to the effectors. Something
quite similar is the case in mechanical systems. Let us consider a
signal tower on a railroad. The signalman controls a number of
levers which turn the semaphore signals on or off and which regulate
the setting of the switches. However, it does not do for him to
assume blindly that the signals and the switches have followed his
orders. It may be that the switches have frozen fast, or that the
weight of a load of snow has bent the signal arms, and that what he
has supposed to be the actual state of the switches and the signals
(his effectors) does not correspond to the orders he has given. To
avoid the dangers inherent in this contingency, every effector, switch
or signal, is attached to a telltale back in the signal tower, which
conveys to the signalman its actual states and performance. This is
the mechanical equivalent of the repeating of orders in the navy,
according to a code by which every subordinate, upon the reception
of an order, must repeat it back to his superior, to show that he has
heard and understood it. It is on such repeated orders that the
signalman must act.

Notice that in this system there is a human link in the chain of
the transmission and return of information: in what we shall from
now on call the chain of feedback. It is true that the signalman is
not altogether a free agent; that his switches and signals are interlocked,
either mechanically or electrically, and that he is not free to
choose some of the more disastrous combinations. There are,
however, feedback chains in which no human element intervenes.
The ordinary thermostat by which we regulate the heating of a house
is one of these. There is a setting for the desired room temperature;
and if the actual temperature of the house is below this, an apparatus
is actuated which opens the damper, or increases the flow of fuel oil,
and brings the temperature of the house up to the desired level. If,
on the other hand, the temperature of the house exceeds the desired
level, the dampers are turned off or the flow of fuel oil is slackened or
interrupted. In this way the temperature of the house is kept
approximately at a steady level. Note that the constancy of this
level depends on the good design of the thermostat, and that a badly
designed thermostat may send the temperature of the house into
violent oscillations not unlike the motions of the man suffering from
cerebellar tremor.
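
The remark about the badly designed thermostat can be illustrated with a deliberately crude model. It is an assumption made here, not a description of any actual thermostat: the temperature error is corrected in proportion to a reading which is `delay_steps` samples old, so that error[n + 1] = error[n] − gain · error[n − delay_steps]. With a delay of one step this recurrence dies away for a positive gain below 1 and breaks into growing oscillation for a gain above 1.

```python
def simulate_thermostat(gain, delay_steps, steps=60, initial_error=1.0):
    """Temperature error under proportional correction based on a delayed reading.

    error[n + 1] = error[n] - gain * error[n - delay_steps]
    """
    errors = [initial_error] * (delay_steps + 1)
    for _ in range(steps):
        observed = errors[-1 - delay_steps]  # the stale reading fed back
        errors.append(errors[-1] - gain * observed)
    return errors

well_designed = simulate_thermostat(gain=0.3, delay_steps=1)
badly_designed = simulate_thermostat(gain=1.5, delay_steps=1)
print(abs(well_designed[-1]))
print(max(abs(e) for e in badly_designed[-10:]))
```

The first number printed is very small and the second very large: the same feedback chain which steadies the temperature at a moderate gain overshoots more and more violently when the correction is too strong for the delay in the loop.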

Another example of a purely mechanical feedback system (the
one originally treated by Clerk Maxwell) is that of the governor of a
steam engine, which serves to regulate its velocity under varying
conditions of load. In the original form designed by Watt, it
consists of two balls attached to pendulum rods and swinging on
opposite sides of a rotating shaft. They are kept down by their own
weight or by a spring, and they are swung upward by a centrifugal
action dependent on the angular velocity of the shaft. They thus
assume a compromise position likewise dependent on the angular
velocity. This position is transmitted by other rods to a collar
about the shaft, which actuates a member which serves to open the
intake valves of the cylinder when the engine slows down and the
balls fall, and to close them when the engine speeds up and the balls
rise. Notice that the feedback tends to oppose what the system is
already doing, and is thus negative.

We have thus examples of negative feedbacks to stabilize tempera- 
ture and negative feedbacks to stabilize velocity. There are also 
negative feedbacks to stabilize position, as in the case of the steering 
engines of a ship, which are actuated by the angular difference 
between the position of the wheel and the position of the rudder, and 
always act so as to bring the position of the rudder into accord with 
that of the wheel. The feedback of voluntary activity is of this 
nature. We do not will the motions of certain muscles, and indeed 
we generally do not know which muscles are to be moved to accomplish
a given task; we will, say, to pick up a cigarette. Our motion is
regulated by some measure of the amount by which it has not yet
been accomplished.

The information fed back to the control center tends to oppose the 
departure of the controlled from the controlling quantity, but it may 
depend in widely different ways on this departure. The simplest 
control systems are linear: the output of the effector is a linear 
expression in the input, and when we add inputs, we also add outputs.
The output is read by some apparatus equally linear. This reading
is simply subtracted from the input. We wish to give a precise 
theory of the performance of such a piece of apparatus, and, in 
particular, of its defective behavior and its breaking into oscillation 
when it is mishandled or overloaded. 

In this book, we have avoided mathematical symbolism and
mathematical technique as far as possible, although we have been
forced to compromise with them in various places, and in particular
in the previous chapter. Here, too, in the rest of the present
chapter, we are dealing precisely with those matters for which the
symbolism of mathematics is the appropriate language, and we can
avoid it only by long periphrases which are scarcely intelligible to the
layman, and which are intelligible only to the reader acquainted with
mathematical symbolism by virtue of his ability to translate them
into this symbolism. The best compromise we can make is to
supplement the symbolism by an ample verbal explanation.

Let f(t) be a function of the time t where t runs from −∞ to ∞;
that is, let f(t) be a quantity assuming a numerical value for each
time t. At any time t, the quantities f(s) are accessible to us when s is
less than or equal to t but not when s is greater than t. There are
pieces of apparatus, electrical and mechanical, which delay their
input by a fixed time, and these yield us, for an input f(t), an output
f(t − τ), where τ is the fixed delay.

We may combine several pieces of apparatus of this kind, yielding
us outputs f(t − τ_1), f(t − τ_2), ..., f(t − τ_n). We can multiply each
of these outputs by fixed quantities, positive or negative. For
example, we may use a potentiometer to multiply a voltage by a
fixed positive number less than 1, and it is not too difficult to devise
automatic balancing devices and amplifiers to multiply a voltage by
quantities which are negative or are greater than 1. It is also not
difficult to construct simple wiring diagrams of circuits by which we
can add voltages continuously, and with the aid of these we may
obtain an output

\[
\sum_{k=1}^{n} a_k\, f(t - \tau_k) \tag{4.01}
\]

By increasing the number of delays τ_k and suitably adjusting the
coefficients a_k, we may approximate as closely as we wish to an
output of the form

\[
\int_0^{\infty} a(\tau)\, f(t - \tau)\, d\tau \tag{4.02}
\]

In this expression, it is important to realize that the fact that we
are integrating from 0 to ∞, and not from −∞ to ∞, is essential.
Otherwise we could use various practical devices to operate on this
result and to obtain f(t + σ), where σ is positive. This, however,
involves the knowledge of the future of f(t); and f(t) may be a
quantity, like the coordinates of a streetcar which may turn off one
way or the other at a switch, which is not determined by its past.
When a physical process seems to yield us an operator which converts
f(t) to

\[
\int_{-\infty}^{\infty} a(\tau)\, f(t - \tau)\, d\tau \tag{4.03}
\]

where a(τ) does not effectively vanish for negative values of τ, it
means that we have no longer a true operator on f(t), determined
uniquely by its past. There are physical cases where this may
occur. For example, a dynamical system with no input may go into
permanent oscillation, or even oscillation building up to infinity,
with an undetermined amplitude. In such a case, the future of the
system is not determined by the past, and we may in appearance
find a formalism which suggests an operator dependent on the future.
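
The approximation of Expression 4.02 by a finite sum of delayed and weighted copies of the input, as in Expression 4.01, can be sketched numerically. The following is only an illustration under stated assumptions: the delays are placed on a uniform grid, the weighting a(τ) = e^{−τ} and the input f(t) = sin t are hypothetical choices for which the integral has a closed form, and the function name is introduced here.

```python
import math

def delay_line_output(f, t, weight, spacing, taps):
    """Expression 4.01: sum_k a_k * f(t - tau_k), with tau_k on a uniform grid.

    The coefficients a_k = weight(tau_k) * spacing make the sum a midpoint
    approximation to the integral of weight(tau) * f(t - tau) over tau >= 0.
    """
    total = 0.0
    for k in range(taps):
        tau_k = (k + 0.5) * spacing  # only non-negative delays: the past of f
        total += weight(tau_k) * spacing * f(t - tau_k)
    return total

weight = lambda tau: math.exp(-tau)  # hypothetical a(tau)
t = 0.7
approx = delay_line_output(math.sin, t, weight, spacing=0.01, taps=4000)
exact = 0.5 * (math.sin(t) - math.cos(t))  # closed form for this a and f
print(approx, exact)
```

Only values of f at or before the time t enter the sum, which is the numerical counterpart of integrating from 0 to ∞ rather than from −∞ to ∞.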

The operation by which we obtain Expression 4.02 from f(t) has
two important further properties: (1) it is independent of a shift of
the origin of time, and (2) it is linear. The first property is expressed
by the statement that if

\[
g(t) = \int_0^{\infty} a(\tau)\, f(t - \tau)\, d\tau \tag{4.04}
\]

then

\[
g(t + \sigma) = \int_0^{\infty} a(\tau)\, f(t + \sigma - \tau)\, d\tau \tag{4.05}
\]

The second property is expressed by the statement that if

\[
g(t) = A f_1(t) + B f_2(t) \tag{4.06}
\]

then

\[
\int_0^{\infty} a(\tau)\, g(t - \tau)\, d\tau
 = A \int_0^{\infty} a(\tau)\, f_1(t - \tau)\, d\tau + B \int_0^{\infty} a(\tau)\, f_2(t - \tau)\, d\tau \tag{4.07}
\]

It may be shown that in an appropriate sense every operator on the
past of f(t) which is linear and is invariant under a shift of the origin
of time is either of the form of Expression 4.02 or is a limit of a sequence
of operators of that form. For example, f′(t) is the result of an
operator with these properties when applied to f(t), and

\[
f'(t) = \lim_{\epsilon \to 0} \int_0^{\infty} \frac{1}{\epsilon^2}\, a\!\left( \frac{\tau}{\epsilon} \right) f(t - \tau)\, d\tau \tag{4.08}
\]

where

\[
a(x) = \begin{cases} 1 & 0 \le x < 1 \\ -1 & 1 \le x < 2 \\ 0 & 2 \le x \end{cases} \tag{4.09}
\]
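
The limit in Eq. 4.08 can be checked numerically for a smooth input. The sketch below evaluates the integral by the midpoint rule for a few fixed values of ε; the input f(t) = sin t, the point t = 0.3, and the function names are hypothetical choices made for the illustration.

```python
import math

def a(x):
    """The kernel of Eq. 4.09."""
    if 0 <= x < 1:
        return 1.0
    if 1 <= x < 2:
        return -1.0
    return 0.0

def approx_derivative(f, t, eps, samples=2000):
    """Midpoint-rule value of the integral in Eq. 4.08 for a fixed eps > 0."""
    step = 2 * eps / samples  # the kernel vanishes for tau >= 2 * eps
    total = 0.0
    for j in range(samples):
        tau = (j + 0.5) * step
        total += a(tau / eps) * f(t - tau) * step
    return total / eps ** 2

t = 0.3
for eps in (0.1, 0.01, 0.001):
    print(eps, approx_derivative(math.sin, t, eps), math.cos(t))
```

As ε decreases the second column approaches the third, while the operator continues to use only the values of f on the interval from t − 2ε to t, that is, only the past.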


As we have seen before, the functions e^{zt} are a set of functions f(t)
which are particularly important from the point of view of Operator
4.02, since

\[
e^{z(t - \tau)} = e^{zt} \cdot e^{-z\tau} \tag{4.10}
\]

and the delay operator becomes merely a multiplier dependent on
z. Thus Operator 4.02 becomes

\[
e^{zt} \int_0^{\infty} a(\tau)\, e^{-z\tau}\, d\tau \tag{4.11}
\]

and is also a multiplication operator dependent on z only. The
expression

\[
\int_0^{\infty} a(\tau)\, e^{-z\tau}\, d\tau = A(z) \tag{4.12}
\]

is said to be the representation of Operator 4.02 as a function of
frequency. If z is taken as the complex quantity x + iy, where x
and y are real, this becomes

\[
\int_0^{\infty} a(\tau)\, e^{-x\tau}\, e^{-iy\tau}\, d\tau \tag{4.13}
\]

so that by the well-known Schwarz inequality concerning integrals,
if x > 0 and

\[
\int_0^{\infty} |a(\tau)|^2\, d\tau < \infty \tag{4.14}
\]

we have

\[
|A(x + iy)| \le \left[ \int_0^{\infty} |a(\tau)|^2\, d\tau \int_0^{\infty} e^{-2x\tau}\, d\tau \right]^{1/2}
 = \left[ \frac{1}{2x} \int_0^{\infty} |a(\tau)|^2\, d\tau \right]^{1/2} \tag{4.15}
\]

This means that A(x + iy) is a bounded holomorphic function of
a complex variable in every half-plane x ≥ ε > 0, and that the function
A(iy) represents in a certain very definite sense the boundary
values of such a function.
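
The representation A(z) and the bound obtained from the Schwarz inequality can both be checked on an example. The sketch below assumes the hypothetical weighting a(τ) = e^{−τ/k}/k, for which A(z) = 1/(1 + kz) and the integral of |a(τ)|² is 1/(2k); the integral is truncated and evaluated by the midpoint rule, and the function name is introduced here for illustration.

```python
import cmath
import math

def frequency_representation(a, z, upper=60.0, samples=60000):
    """Midpoint-rule value of A(z) = integral_0^inf a(tau) * exp(-z * tau) d tau.

    The integral is truncated at `upper`, which is adequate only when
    a(tau) * exp(-Re(z) * tau) is negligible beyond that point.
    """
    step = upper / samples
    total = 0j
    for j in range(samples):
        tau = (j + 0.5) * step
        total += a(tau) * cmath.exp(-z * tau) * step
    return total

k = 2.0
a = lambda tau: math.exp(-tau / k) / k  # hypothetical weighting
energy = 1.0 / (2.0 * k)                # integral of a(tau)**2 over tau >= 0

for z in (0.5 + 0.0j, 0.5 + 3.0j, 2.0 - 1.0j):
    A = frequency_representation(a, z)
    bound = math.sqrt(energy / (2.0 * z.real))
    print(z, abs(A), abs(1.0 / (1.0 + k * z)), bound)
```

In each row the computed modulus agrees with that of 1/(1 + kz) and does not exceed the bound; at z = 1/k the bound is attained, since there a(τ) is proportional to e^{−xτ}.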

Let us put

\[
u + iv = A(x + iy) \tag{4.16}
\]

where u and v are real. Then x + iy will be determined as a function
(not necessarily single-valued) of u + iv. This function will be
analytic, though meromorphic, except at the points u + iv corresponding
to points z = x + iy, where dA(z)/dz = 0. The boundary
x = 0 will go into the curve with the parametric equation

\[
u + iv = A(iy) \qquad (y \text{ real}) \tag{4.17}
\]

This new curve may intersect itself any number of times. In
general, however, it will divide the plane into two regions. Let us
consider the curve (Eq. 4.17) traced in the direction in which y goes
from −∞ to ∞. Then if we depart from Eq. 4.17 to the right and
follow a continuous course not again cutting Eq. 4.17, we may
arrive at certain points. The points which are neither in this set
nor on Eq. 4.17 we shall call exterior points. The part of the curve
(Eq. 4.17) which contains limit points of the exterior points we shall
call the effective boundary. All other points will be termed interior
points. Thus in the diagram of Fig. 1, with the boundary drawn in
the sense of the arrow, the interior points are shaded and the effective
boundary is drawn heavily.

Fig. 1.

The condition that A be bounded in any right half-plane will then
tell us that the point at infinity cannot be an interior point. It may be
a boundary point, although there are certain very definite restrictions
on the character of the type of boundary point it may be.
These concern the “thickness” of the set of interior points reaching
out to infinity.

Now we come to the problem of the mathematical expression of
the problem of linear feedback. Let the control flow chart (not the
wiring diagram) of such a system be as shown in Fig. 2.

[Fig. 2. Control flow chart. Labels: input X; subtractor output Y = X − λAY; motor, operator A; motor output AY; multiplier, operator λ; fed-back quantity λAY.]

Here the input of the motor is Y, which is the difference between the
original input X and the output of the multiplier, which multiplies
the power output AY of the motor by the factor λ. Thus

\[
Y = X - \lambda A Y \tag{4.18}
\]

and

\[
Y = \frac{X}{1 + \lambda A} \tag{4.19}
\]

so that the motor output is

\[
AY = X\, \frac{A}{1 + \lambda A} \tag{4.20}
\]

The operator produced by the whole feedback mechanism is then
A/(1 + λA). This will be infinite when and only when A = −1/λ.
The diagram (Eq. 4.17) for this new operator will be

\[
u + iv = \frac{A(iy)}{1 + \lambda A(iy)} \tag{4.21}
\]

and ∞ will be an interior point of this when and only when −1/λ is an
interior point of Eq. 4.17.

In this case, a feedback with a multiplier λ will certainly produce
something catastrophic, and as a matter of fact the catastrophe will
be that the system will go into unrestrained and increasing oscillation.
If, on the other hand, the point −1/λ is an exterior point, it may be
shown that there will be no difficulty, and the feedback is stable.
If −1/λ is on the effective boundary, a more elaborate discussion is
necessary. Under most circumstances, the system may go into an
oscillation of an amplitude which does not increase.

It is perhaps worth considering several operators A and the ranges 
of feedback which are admissible under them. We shall consider 
not only the operations of Expression 4.02 but also their limits, 
assuming that the same argument will apply to these. 

If the operator A corresponds to the differential operator, A(z) = z,
as y goes from −∞ to ∞, A(y) does the same, and the interior points
are the points interior to the right half-plane. The point −1/λ is
always an exterior point, and any amount of feedback is possible. If

$$A(z) = \frac{1}{1 + kz} \tag{4.22}$$

the curve (Eq. 4.17) is

$$u + iv = \frac{1}{1 + kiy} \tag{4.23}$$

or

$$u = \frac{1}{1 + k^2 y^2}, \qquad v = \frac{-ky}{1 + k^2 y^2} \tag{4.24}$$

which we may write

$$u^2 + v^2 = u \tag{4.25}$$


This is a circle with radius 1/2, and center at (1/2, 0). It is described
in the clockwise sense, and the interior points are those
which we should ordinarily consider interior. In this case too, the
admissible feedback is unlimited, as −1/λ is always outside the circle.
The a(t) corresponding to this operator is

$$a(t) = \frac{e^{-t/k}}{k} \tag{4.26}$$

Again, let

$$A(z) = \left(\frac{1}{1 + kz}\right)^2 \tag{4.27}$$

Then Eq. 4.17 is

$$u + iv = \left(\frac{1}{1 + kiy}\right)^2 = \frac{(1 - kiy)^2}{(1 + k^2 y^2)^2} \tag{4.28}$$

and

$$u = \frac{1 - k^2 y^2}{(1 + k^2 y^2)^2}, \qquad v = \frac{-2ky}{(1 + k^2 y^2)^2} \tag{4.29}$$

This yields

$$u^2 + v^2 = \frac{1}{(1 + k^2 y^2)^2} \tag{4.30}$$

or

$$y = \frac{-v}{(u^2 + v^2)\,2k} \tag{4.31}$$

Then

$$u = (u^2 + v^2)\left[1 - \frac{k^2 v^2}{4k^2 (u^2 + v^2)^2}\right] = (u^2 + v^2) - \frac{v^2}{4(u^2 + v^2)} \tag{4.32}$$

In polar coordinates, if u = ρ cos φ, v = ρ sin φ, this becomes

$$\rho\cos\phi = \rho^2 - \frac{\sin^2\phi}{4} = \rho^2 - \frac{1}{4} + \frac{\cos^2\phi}{4} \tag{4.33}$$

or

$$\rho - \frac{\cos\phi}{2} = \pm\frac{1}{2} \tag{4.34}$$

That is,

$$\rho = \cos^2\frac{\phi}{2}, \qquad \rho = -\sin^2\frac{\phi}{2} \tag{4.35}$$


It can be shown that these two equations represent only one curve,
a cardioid with vertex at the origin and cusp pointing to the right.
The interior of this curve will contain no point of the negative real
axis, and, as in the previous case, the admissible amplification is
unlimited. Here the operator a(t) is

$$a(t) = \frac{t}{k^2}\,e^{-t/k} \tag{4.36}$$

Let

$$A(z) = \left(\frac{1}{1 + kz}\right)^3 \tag{4.37}$$

Let ρ and φ be defined as in the last case. Then

$$\rho^{1/3}\cos\frac{\phi}{3} + i\,\rho^{1/3}\sin\frac{\phi}{3} = \frac{1}{1 + kiy} \tag{4.38}$$

As in the first case, this will give us

$$\rho^{2/3}\cos^2\frac{\phi}{3} + \rho^{2/3}\sin^2\frac{\phi}{3} = \rho^{1/3}\cos\frac{\phi}{3} \tag{4.39}$$

That is,

$$\rho^{1/3} = \cos\frac{\phi}{3} \tag{4.40}$$

which is a curve of the shape of Fig. 3. The shaded region represents



Fig. 3. 


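the interior points. The curve cuts the negative real axis at −1/8, so
that −1/λ is an interior point whenever λ exceeds 8: all feedback with
coefficient exceeding 8 is impossible. The corresponding a(t) is

$$a(t) = \frac{t^2}{2k^3}\,e^{-t/k} \tag{4.41}$$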

Finally, let our operator corresponding to A be a simple delay of
T units of time. Then

$$A(z) = e^{-Tz} \tag{4.42}$$

Then

$$u + iv = e^{-Tiy} = \cos Ty - i\sin Ty \tag{4.43}$$

The curve (Eq. 4.17) will be the unit circle about the origin,
described in a clockwise sense about the origin with unit velocity.
The inside of this curve will be the inside in the ordinary sense, and
the limit of feedback intensity will be 1.
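
The limits found for these operators can be checked numerically. What follows is a minimal Python sketch, not part of the original argument: it samples the locus u + iv = A(iy) for y > 0, looks for the points where the locus cuts the negative real axis, and takes the reciprocal of the leftmost crossing as the largest admissible multiplier. The function names, the sampling range, and the choice k = T = 1 are assumptions made for illustration, and the rule used (only the leftmost crossing matters) presumes that the interior region reaches from that crossing to the origin, as it does in the cases treated here.

```python
import cmath
import math


def negative_axis_crossings(A, y_max=50.0, n=50_000):
    """Real-axis crossings with u < 0 of the locus u + iv = A(iy), 0 < y <= y_max."""
    dy = y_max / n
    crossings = []
    prev = A(1j * dy)
    for i in range(2, n + 1):
        cur = A(1j * i * dy)
        if prev.imag * cur.imag < 0:  # v changes sign between the two samples
            t = prev.imag / (prev.imag - cur.imag)  # linear interpolation weight
            u = prev.real + t * (cur.real - prev.real)
            if u < 0:
                crossings.append(u)
        prev = cur
    return crossings


def feedback_limit(A, **kwargs):
    """Reciprocal of the leftmost negative-axis crossing (infinite if there is none)."""
    crossings = negative_axis_crossings(A, **kwargs)
    if not crossings:
        return math.inf  # the locus never cuts the negative real axis
    return 1.0 / max(abs(u) for u in crossings)


if __name__ == "__main__":
    # Illustrative operators with k = T = 1 (assumed values).
    operators = {
        "z": lambda z: z,
        "1/(1+z)": lambda z: 1 / (1 + z),
        "1/(1+z)^2": lambda z: 1 / (1 + z) ** 2,
        "1/(1+z)^3": lambda z: 1 / (1 + z) ** 3,
        "exp(-z)": lambda z: cmath.exp(-z),
    }
    for name, op in operators.items():
        print(name, feedback_limit(op))
```

For the cube of 1/(1 + kz) the locus crosses the axis at −1/8, and for the pure delay at −1, which is where the two finite limits come from.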

There is one very interesting conclusion to be drawn from this.
It is possible to compensate for the operator 1/(1 + kz) by an
arbitrarily heavy feedback, which will give us an A/(1 + λA) as near
to 1 as we wish for as large a frequency range as we wish. It is
thus possible to compensate for three successive operators of this
sort by three (or even two) successive feedbacks. It is not,
however, possible to compensate as closely as we wish for an operator
1/(1 + kz)³, which is the resultant of the composition of three
operators 1/(1 + kz) in cascade, by a single feedback. The operator
1/(1 + kz)³ may also be written

$$\frac{1}{2k^2}\,\frac{d^2}{dz^2}\,\frac{1}{1 + kz} \tag{4.44}$$

and may be regarded as the limit of the additive composition of three
operators with first-degree denominators. It thus appears that a
sum of different operators, each of which may be compensated as well
as we wish by a single feedback, cannot itself be so compensated.
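
The flattening effect of heavy feedback on the operator 1/(1 + kz) can be illustrated numerically. The sketch below is an illustration under stated assumptions, not the author's computation: it evaluates the feedback operator A/(1 + λA) at z = iy and multiplies it by λ, so that the low-frequency value tends to 1. This normalization by λ, the function name, and the sample values of k, λ, and y are all introduced here for the example.

```python
def normalized_closed_loop_gain(lam, k, y):
    """|lam * A / (1 + lam * A)| at z = iy for A(z) = 1 / (1 + k z)."""
    A = 1.0 / (1.0 + k * 1j * y)
    return abs(lam * A / (1.0 + lam * A))


if __name__ == "__main__":
    k = 1.0  # assumed time constant
    for lam in (1.0, 10.0, 1000.0):
        gains = [normalized_closed_loop_gain(lam, k, y) for y in (0.0, 1.0, 10.0)]
        print(lam, [round(g, 4) for g in gains])
```

As λ grows, the printed gains approach 1 over a wider and wider range of y, which is the sense in which the lag is compensated.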

In the important book of MacColl, we have an example of a
complicated system which can be stabilized by two feedbacks but
not by one. It concerns the steering of a ship by a gyrocompass.
The angle between the course set by the quartermaster and that
shown by the compass expresses itself in the turning of the rudder,
which, in view of the headway of the ship, produces a turning
moment which serves to change the course of the ship in such a way
as to decrease the difference between the set course and the actual
course. If this is done by a direct opening of the valves of one
steering engine and closing of the valves of the other in such a way
that the turning velocity of the rudder is proportional to the deviation
of the ship from this course, let us note that the angular position
of the rudder is roughly proportional to the turning moment of the
ship and thus to its angular acceleration. Hence the amount of
turning of the ship is proportional with a negative factor to the third
derivative of the deviation from the course, and the operation which
we have to stabilize by the feedback from the gyrocompass is kz³,
where k is positive. We thus get for our curve (Eq. 4.17)

$$u + iv = -kiy^3 \tag{4.45}$$

and, as the left half-plane is the interior region, no servomechanism
whatever will stabilize the system.

In this account, we have slightly oversimplified the steering
problem. Actually there is a certain amount of friction, and the
force turning the ship does not determine the acceleration. Instead,
if θ is the angular position of the ship and φ that of the rudder with
respect to the ship, we have

$$\frac{d^2\theta}{dt^2} = c_1\phi - c_2\frac{d\theta}{dt} \tag{4.46}$$

and

$$u + iv = -k_1 i y^3 - k_2 y^2 \tag{4.47}$$

This curve may be written

$$v^2 = -k_3 u^3 \tag{4.48}$$

which still cannot be stabilized by any feedback. As y goes from
−∞ to ∞, v goes from ∞ to −∞, and the inside of the curve is to
the left.

If, on the other hand, the position of the rudder is proportional to
the deviation of the course, the operator to be stabilized by feedback
is k₁z² + k₂z, and Eq. 4.17 becomes

$$u + iv = -k_1 y^2 + k_2 i y \tag{4.49}$$

This curve may be written

$$v^2 = -k_3 u \tag{4.50}$$

but in this case, as y goes from −∞ to ∞, so does v, and the curve is
described from y = −∞ to y = ∞. In this case, the outside of the
curve is to the left, and an unlimited amount of amplification is possible.
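
The contrast between the two steering arrangements can also be checked algebraically. The sketch below swaps in the Routh-Hurwitz criterion, which the text does not use, in place of the geometric argument: the feedback operator A/(1 + λA) is taken to be stable when every root of 1 + λA(z) = 0 has a negative real part. The function names and the sample constants are hypothetical, and only a positive multiplier λ is considered.

```python
def hurwitz_stable(coeffs):
    """True if a real polynomial of degree 2 or 3 (highest power first)
    has all of its roots in the open left half-plane."""
    if coeffs[0] < 0:
        coeffs = [-c for c in coeffs]  # normalize to a positive leading coefficient
    if any(c <= 0 for c in coeffs):
        return False  # a zero or negative coefficient rules out stability
    if len(coeffs) == 3:
        return True  # for a quadratic, positive coefficients are sufficient
    if len(coeffs) == 4:
        a3, a2, a1, a0 = coeffs
        return a2 * a1 > a3 * a0  # Routh-Hurwitz condition for a cubic
    raise ValueError("only degrees 2 and 3 are handled in this sketch")


def rate_control_stable(lam, k1, k2):
    """Rudder velocity proportional to deviation: A(z) = k1 z^3 + k2 z^2."""
    return hurwitz_stable([lam * k1, lam * k2, 0.0, 1.0])


def position_control_stable(lam, k1, k2):
    """Rudder position proportional to deviation: A(z) = k1 z^2 + k2 z."""
    return hurwitz_stable([lam * k1, lam * k2, 1.0])


if __name__ == "__main__":
    for lam in (0.01, 1.0, 100.0):
        print(lam, rate_control_stable(lam, 1.0, 0.5), position_control_stable(lam, 1.0, 0.5))
```

In the first arrangement the polynomial lacks a first-power term, so no positive λ stabilizes it; in the second every positive λ does.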

To achieve this we may employ another stage of feedback. If
we regulate the position of the valves of the steering engine, not by
the discrepancy between the actual and the desired course but by the
difference between this quantity and the angular position of the
rudder, we shall keep the angular position of the rudder as nearly
proportional to the ship’s deviation from true course as we wish, if
we allow a large enough feedback, that is, if we open the valves
wide enough. This double feedback system of control is in fact the
one usually adopted for the automatic steering of ships by means of
the gyrocompass.

In the human body, the motion of a hand or a finger involves a
system with a large number of joints. The output is an additive
vectorial combination of the outputs of all these joints. We have
seen that, in general, a complex additive system like this cannot be
stabilized by a single feedback. Correspondingly, the voluntary
feedback by which we regulate the performance of a task through the
observation of the amount by which it is not yet accomplished needs
the backing up of other feedbacks. These we call postural feedbacks,
and they are associated with the general maintenance of tone
of the muscular system. It is the voluntary feedback which shows
a tendency to break down or become deranged in cases of cerebellar
injury, for the ensuing tremor does not appear unless the patient
tries to perform a voluntary task. This purpose tremor, in which
the patient cannot pick up a glass of water without upsetting it, is
very different in nature from the tremor of Parkinsonism, or
paralysis agitans, which appears in its most typical form when the
patient is at rest, and indeed often seems to be greatly mitigated
when he attempts to perform a specific task. There are surgeons
with Parkinsonism who manage to operate quite efficiently.
Parkinsonism is known not to have its origin in a diseased condition
of the cerebellum, but to be associated with a pathological focus
somewhere in the brain stem. It is only one of the diseases of the
postural feedbacks, and many of these must have their origin in
defects of parts of the nervous system situated very differently. One
of the great tasks of physiological cybernetics is to disentangle and
isolate loci of the different parts of this complex of voluntary and
postural feedbacks. Examples of component reflexes of this sort
are the scratch and the walking reflex.

When feedback is possible and stable, its advantage, as we have 
already said, is to make performance less dependent on the load. 
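Let us consider that the load changes the characteristic A by dA.
The fractional change will be dA/A. If the operator after feedback
is

$$B = \frac{A}{C + A} \tag{4.51}$$

we shall have

$$\frac{dB}{B} = -\frac{d\left(1 + \dfrac{C}{A}\right)}{1 + \dfrac{C}{A}} = \frac{\dfrac{C}{A^2}\,dA}{1 + \dfrac{C}{A}} = \frac{dA}{A}\,\frac{C}{A + C} \tag{4.52}$$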


Thus feedback serves to diminish the dependence of the system on
the characteristic of the motor, and serves to stabilize it, for all
frequencies for which

$$\left|\frac{A + C}{C}\right| > 1 \tag{4.53}$$


This is to say that the entire boundary between interior and exterior 
points must lie inside the circle of radius C about the point —C. 
This will not even be true in the first of the cases we have discussed. 
The effect of a heavy negative feedback, if it is at all stable, will be 
to increase the stability of the system for low frequencies, but 
generally at the expense of its stability for some high frequencies. 
There are many cases in which even this degree of stabilization is
advantageous.
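
Equation 4.52 can be checked with a small numerical experiment. In the sketch below, A and C are treated as positive real numbers purely for illustration (an assumption; in the text they depend on frequency). A small fractional change is applied to A, and the resulting fractional change in B = A/(C + A) is compared with the factor C/(A + C). The function names and sample values are hypothetical.

```python
def closed_loop(A, C):
    """Operator after feedback, B = A / (C + A), as in Eq. 4.51."""
    return A / (C + A)


def sensitivity_ratio(A, C, rel_step=1e-6):
    """Finite-difference estimate of (dB/B) / (dA/A)."""
    dA = rel_step * A
    dB = closed_loop(A + dA, C) - closed_loop(A, C)
    return (dB / closed_loop(A, C)) / (dA / A)


if __name__ == "__main__":
    A, C = 10.0, 1.0  # assumed sample values
    print("measured :", sensitivity_ratio(A, C))
    print("Eq. 4.52 :", C / (A + C))
```

With A ten times C, a one per cent change in the motor characteristic shows up as roughly one eleventh of one per cent in B.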

A very important question which arises in connection with oscillations
due to an excessive amount of feedback is that of the frequency
of incipient oscillation. This is determined by the value of y in the
iy corresponding to the point of the boundary of the inside and
outside regions of Eq. 4.17 lying furthest to the left on the negative
u-axis. The quantity y is of course of the nature of a frequency.

We have now come to the end of an elementary discussion of 
linear oscillations, studied from the point of view of feedback. A 
linear oscillating system has certain very special properties which 
characterize its oscillations. One is that when it oscillates, it always 
can and very generally—in the absence of independent simultaneous 
oscillations —does oscillate in the form 

$$A\sin(Bt + C)\,e^{Dt} \tag{4.54}$$

The existence of a periodic non-sinusoidal oscillation is always a 
suggestion at least that the variable observed is one in which the 
system is not linear. In some cases, but in very few, the system may 
be rendered linear again by a new choice of the independent variable. 

Another very significant difference between linear and non- 
linear oscillations is that in the first the amplitude of oscillation is 
completely independent of the frequency; while in the latter, there is 
generally only one amplitude, or at most a discrete set of amplitudes, 
for which the system will oscillate at a given frequency, as well as a
discrete set of frequencies for which the system will oscillate. This 
is well illustrated by the study of what happens in an organ pipe. 
There are two theories of the organ pipe—a cruder linear theory, and 
a more precise non-linear theory. In the first, the organ pipe is 
treated as a conservative system. No question is asked about how 
the pipe came to oscillate, and the level of oscillation is completely
indeterminate. In the second theory, the oscillation of the organ pipe 
is considered as dissipating energy, and this energy is considered to 
have its origin in the stream of air across the lip of the pipe. There 
is indeed a theoretical steady-state flow of air across the lip of the 
pipe which does not interchange any energy with any of the modes of 
oscillation of the pipe, but for certain velocities of air flow this 
steady-state condition is unstable. The slightest chance deviation 
from it will introduce an energy input into one or more of the natural 
modes of linear oscillation of the pipe; and up to a certain point, 
this motion will actually increase the coupling of the proper modes 
of oscillation of the pipe with the energy input. The rate of energy 
input and the rate of energy output by thermal dissipation and 
otherwise have different laws of growth, but, to arrive at a steady 
state of oscillation, these two quantities must be identical. Thus
the level of the non-linear oscillation is determined just as definitely
as its frequency.

The case we have examined is an example of what is known as a
relaxation oscillation: a case, that is, where a system of equations
invariant under a translation in time leads to a solution periodic (or
corresponding to some generalized notion of periodicity) in time,
and determinate in amplitude and frequency but not in phase. In
the case we have discussed, the frequency of oscillation of the system
is close to that of some loosely coupled, nearly linear part of the
system. B. van der Pol, one of the chief authorities on relaxation
oscillations, has pointed out that this is not always the case, and that
there are in fact relaxation oscillations where the predominating
frequency is not near the frequency of linear oscillation of any part
of the system. An example is given by a stream of gas flowing into
a chamber open to the air and in which a pilot light is burning: when
the concentration of gas in the air reaches a certain critical value,
the system is ready to explode under ignition by the pilot light, and
the time it takes for this to happen depends only on the rate of flow of
the coal gas, the rate at which air seeps in and the products of
combustion seep out, and the percentage composition of an explosive
mixture of coal gas and air.
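
That such an oscillation is determinate in amplitude can be seen in a small simulation. The sketch below uses the van der Pol equation x'' − μ(1 − x²)x' + x = 0, which is not written out in the text and is assumed here as a standard example; the integrator (classical Runge-Kutta), the step size, the value μ = 1, and the function name are likewise choices made for illustration.

```python
def van_der_pol_amplitude(x0, mu=1.0, dt=0.01, t_end=100.0, t_measure=20.0):
    """Integrate x'' - mu (1 - x^2) x' + x = 0 from (x, x') = (x0, 0) and
    return the largest |x| seen during the final t_measure time units."""

    def deriv(x, v):
        return v, mu * (1.0 - x * x) * v - x

    x, v = x0, 0.0
    steps = int(round(t_end / dt))
    measure_from = steps - int(round(t_measure / dt))
    amplitude = 0.0
    for i in range(steps):
        # One classical fourth-order Runge-Kutta step.
        k1x, k1v = deriv(x, v)
        k2x, k2v = deriv(x + 0.5 * dt * k1x, v + 0.5 * dt * k1v)
        k3x, k3v = deriv(x + 0.5 * dt * k2x, v + 0.5 * dt * k2v)
        k4x, k4v = deriv(x + dt * k3x, v + dt * k3v)
        x += dt * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0
        v += dt * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
        if i >= measure_from:
            amplitude = max(amplitude, abs(x))
    return amplitude


if __name__ == "__main__":
    for start in (0.1, 4.0):
        print(start, van_der_pol_amplitude(start))
```

Starting from a small or a large displacement, the oscillation settles to nearly the same amplitude; only the phase remembers the starting point.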

In general, non-linear systems of equations are hard to solve. 
There is, however, a specially tractable case, in which the system 
differs only slightly from a linear system, and the terms which 
distinguish it change so slowly that they may be considered
substantially constant over a period of oscillation. In this case, we
may study the non-linear system as if it were a linear system with
slowly varying parameters. Systems which may be studied this
way are said to be perturbed secularly, and the theory of secularly 
perturbed systems plays a most important role in gravitational 
astronomy. 

It is quite possible that some of the physiological tremors may be 
treated somewhat roughly as secularly perturbed linear systems. 
We can see quite clearly in such a system why the steady-state 
amplitude level may be just as determinate as the frequency. Let
one element in such a system be an amplifier whose gain decreases as
some long-time average of the input of such a system increases.
Then as the oscillation of the system builds up, the gain may be
reduced until a state of equilibrium is reached.
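
A toy model makes this mechanism concrete. The sketch below is an assumed illustration, not a model taken from the text: a linear oscillator x'' − 2(g − c)x' + x = 0 is driven by an amplifier whose gain g = g₀/(1 + E) falls as E, a running average of x² with time constant τ, rises. Equilibrium requires g = c, that is, E = g₀/c − 1; with the assumed values g₀ = 0.1 and c = 0.05 this gives E = 1 and an amplitude near √2. All names and constants are hypothetical.

```python
import math


def settle_amplitude(x0, g0=0.1, loss=0.05, tau=20.0, dt=0.02, t_end=600.0, t_measure=50.0):
    """Largest |x| over the final t_measure time units of the gain-controlled oscillator."""

    def deriv(x, v, e):
        gain = g0 / (1.0 + e)  # gain falls as the long-time average e rises
        return v, 2.0 * (gain - loss) * v - x, (x * x - e) / tau

    state = (x0, 0.0, 0.0)
    steps = int(round(t_end / dt))
    measure_from = steps - int(round(t_measure / dt))
    amplitude = 0.0
    for i in range(steps):
        # One classical fourth-order Runge-Kutta step on (x, x', E).
        k1 = deriv(*state)
        k2 = deriv(*(s + 0.5 * dt * k for s, k in zip(state, k1)))
        k3 = deriv(*(s + 0.5 * dt * k for s, k in zip(state, k2)))
        k4 = deriv(*(s + dt * k for s, k in zip(state, k3)))
        state = tuple(
            s + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)
        )
        if i >= measure_from:
            amplitude = max(amplitude, abs(state[0]))
    return amplitude


if __name__ == "__main__":
    for start in (0.05, 3.0):
        print(start, settle_amplitude(start), "expected about", math.sqrt(2.0))
```

Over any single period the gain is nearly constant, so the system behaves as a linear one with slowly varying parameters, yet the level at which it settles is fixed.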

Non-linear systems of relaxation oscillations have been studied in
some cases by methods developed by Hill and Poincaré.1 The
classical cases for the study of such oscillations are those in which
the equations of the systems are of a differential nature; especially
where these differential equations are of low order. There is not,
as far as I know, any comparable adequate study of the corresponding
integral equations when the system depends for its future behavior on
its entire past behavior. However, it is not hard to sketch out the
form such a theory should take, especially when we are looking only
for periodic solutions. In this case, the slight modification of the
constants of the equation should lead to a slight, and therefore nearly
linear, modification of the equations of motion. For example, let
Op[f(t)] be a function of t which results from a non-linear operation
on f(t), and which is affected by a translation. Then the variation
of Op[f(t)], δ Op[f(t)], corresponding to a variational change δf(t) in
f(t) and a known change in the dynamics of the system, is linear but
not homogeneous in δf(t), though not linear in f(t). If we now
know a solution f(t) of

$$\mathrm{Op}[f(t)] = 0 \tag{4.55}$$

and we change the dynamics of the system, we obtain a linear
non-homogeneous equation for δf(t). If

$$f(t) = \sum_{n=-\infty}^{\infty} a_n e^{i\lambda n t} \tag{4.56}$$

and f(t) + δf(t) is also periodic, being of the form

$$f(t) + \delta f(t) = \sum_{n=-\infty}^{\infty} (a_n + \delta a_n)\,e^{i(\lambda + \delta\lambda) n t} \tag{4.57}$$

then

$$\delta f(t) = \sum_{n=-\infty}^{\infty} \delta a_n\,e^{i\lambda n t} + \sum_{n=-\infty}^{\infty} a_n e^{i\lambda n t}\,i n\,\delta\lambda\,t \tag{4.58}$$

The linear equations for δf(t) will have all coefficients developable
into series in $e^{i\lambda n t}$, since f(t) can itself be developed in this form.
We shall thus obtain an infinite system of linear non-homogeneous
equations in $\delta a_n + a_n$, δλ, and λ, and this system of equations may
be solvable by the methods of Hill. In this case, it is at least
conceivable that by starting with a linear equation (non-homogeneous)
and gradually shifting the constraints we may arrive at a solution
of a very general type of non-linear problem in relaxation oscillations.
This work, however, lies in the future.

1 Poincaré, H., Les Méthodes Nouvelles de la Mécanique Céleste, Gauthier-Villars
et fils, Paris, 1892-1899.

To a certain extent, the feedback systems of control discussed in
this chapter and the compensation systems discussed in the previous
one are competitors. They both serve to bring the complicated
input-output relations of an effector into a form approaching a simple
proportionality. The feedback system, as we have seen, does more
than this, and has a performance relatively independent of the
characteristic and changes of characteristic of the effector used.
The relative usefulness of the two methods of control thus depends on
the constancy of the characteristic of the effector. It is natural to
suppose that cases arise in which it is advantageous to combine the
two methods. There are various ways of doing this. One of the
most simple is that illustrated in the diagram of Fig. 4.



Fig. 4.

In this, the entire feedback system may be regarded as a larger
effector, and no new point arises, except that the compensator must
be arranged to compensate what is in some sense the average
characteristic of the feedback system. Another type of arrangement
is shown in Fig. 5.



Fig. 5.


Here the compensator and effector are combined into one larger
effector. This change will in general alter the maximum feedback
admissible, and it is not easy to see how it can ordinarily be made to
increase that level to an important extent. On the other hand, for
the same feedback level, it will most definitely improve the
performance of the system. If, for example, the effector has an
essentially lagging characteristic, the compensator will be an anticipator
or predictor, designed for its statistical ensemble of inputs. Our
feedback, which we may call an anticipatory feedback, will tend to
hurry up the action of the effector mechanism.

Feedbacks of this general type are certainly found in human and
animal reflexes. When we go duck shooting, the error which we try
to minimize is not that between the position of the gun and the actual
position of the target but that between the position of the gun and
the anticipated position of the target. Any system of anti-aircraft
fire control must meet the same problem. The conditions of stability
and effectiveness of anticipatory feedbacks need a more thorough
discussion than they have yet received.

Another interesting variant of feedback systems is found in the 
way in which we steer a car on an icy road. Our entire conduct of 
driving depends on a knowledge of the slipperiness of the road
surface, that is, on a knowledge of the performance characteristics 
of the system car-road. If we wait to find this out by the ordinary 
performance of the system, we shall discover ourselves in a skid 
before we know it. We thus give to the steering wheel a succession 
of small, fast impulses, not enough to throw the car into a major 
skid but quite enough to report to our kinesthetic sense whether the 
car is in danger of skidding, and we regulate our method of steering
accordingly. 

This method of control, which we may call control by informative
feedback, is not difficult to schematize into a mechanical form and
may well be worth while employing in practice. We have a compensator
for our effector, and this compensator has a characteristic
which may be varied from outside. We superimpose on the incoming
message a weak high-frequency input and take off the output
of the effector a partial output of the same high frequency, separated 
from the rest of the output by an appropriate filter. We explore 
the amplitude-phase relations of the high-frequency output to the 
input in order to obtain the performance characteristics of the 
effector. On the basis of this, we modify in the appropriate sense 
the characteristics of the compensator. The flow chart of the system 
is much as in the diagram of Fig. 6. 
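
The scheme can be sketched in a few lines under strong simplifying assumptions. Here the effector is taken to be a pure gain of unknown size, the "appropriate filter" is realized by correlating the output with the probe (synchronous detection, a technique chosen for this example rather than taken from the text), and the compensator is simply set to the reciprocal of the estimated gain. The function name, the probe frequency, and the sample message are all hypothetical.

```python
import math


def estimate_effector_gain(effector, message, probe_amp=0.01, probe_hz=50.0,
                           duration=2.0, rate_hz=1000.0):
    """Estimate the gain seen by a weak high-frequency probe added to the message.

    The output is correlated with the probe over a whole number of probe periods,
    which rejects the slowly varying message and keeps only the probe response.
    """
    n = int(round(duration * rate_hz))
    acc = 0.0
    for i in range(n):
        t = i / rate_hz
        reference = math.sin(2.0 * math.pi * probe_hz * t)
        out = effector(message(t) + probe_amp * reference)
        acc += out * reference
    return 2.0 * acc / (probe_amp * n)


if __name__ == "__main__":
    true_gain = 2.5  # unknown to the controller in practice
    effector = lambda x: true_gain * x
    message = lambda t: math.sin(2.0 * math.pi * 0.5 * t)  # slow incoming message
    g_hat = estimate_effector_gain(effector, message)
    compensator_gain = 1.0 / g_hat  # adjust the compensator from the reading
    print(g_hat, compensator_gain * true_gain)
```

A real effector would also shift the phase of the probe, so a second correlation against a cosine reference would be needed to recover the amplitude-phase relation mentioned above.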

The advantages of this type of feedback are that the compensator 
may be adjusted to give stability for every type of constant load; 
and that, if the characteristic of the load changes slowly enough, in 
what we have called a secular manner, in comparison with the changes 
of the original input, and if the reading of the load condition is 
accurate, the system has no tendency to go into oscillation. There 
are very many cases where the change of load is secular in this 
manner. For example, the frictional load of a gun turret depends on
the stiffness of the grease, and this again on the temperature; but this
stiffness will not change appreciably in a few swings of the turret.
Of course, this informative feedback will work well only if the
characteristics of the load at high frequencies are the same as, or
give a good indication of, its characteristics at low frequencies.
This will often be the case if the character of the load, and hence of
the effector, contains a relatively small number of variable parameters.

Fig. 6.

This informative feedback and the examples we have given of 
feedback with compensators are only particular cases of what is a 
very complicated theory, and a theory as yet imperfectly studied. 
The whole field is undergoing a very rapid development. It deserves 
much more attention in the near future. 

Before we end this chapter, we must not forget another important
physiological application of the principle of feedback. A great
group of cases in which some sort of feedback is not only exemplified
in physiological phenomena but is absolutely essential for the
continuation of life is found in what is known as homeostasis. The
conditions under which life, especially healthy life, can continue in
the higher animals are quite narrow. A variation of one-half
degree centigrade in the body temperature is generally a sign of
illness, and a permanent variation of five degrees is scarcely consistent
with life. The osmotic pressure of the blood and its hydrogen-ion
concentration must be held within strict limits. The waste products
of the body must be excreted before they rise to toxic concentrations.
Beside all these, our leucocytes and our chemical defenses against
infection must be kept at adequate levels; our heart rate and blood
pressure must neither be too high nor too low; our sex cycle must
conform to the racial needs of reproduction; our calcium metabolism
must be such as neither to soften our bones nor to calcify our tissues;
and so on. In short, our inner economy must contain an assembly
of thermostats, automatic hydrogen-ion-concentration controls,
governors, and the like, which would be adequate for a great chemical
plant. These are what we know collectively as our homeostatic
mechanism.

Our homeostatic feedbacks have one general difference from our
voluntary and our postural feedbacks: they tend to be slower.
There are very few changes in physiological homeostasis (not even
cerebral anemia) that produce serious or permanent damage in a
small fraction of a second. Accordingly, the nerve fibers reserved
for the processes of homeostasis (the sympathetic and parasympathetic
systems) are often non-myelinated and are known to
have a considerably slower rate of transmission than the myelinated
fibers. The typical effectors of homeostasis (smooth muscles and
glands) are likewise slow in their action compared with striped
muscles, the typical effectors of voluntary and postural activity.
Many of the messages of the homeostatic system are carried by
non-nervous channels (the direct anastomosis of the muscular fibers of
the heart, or chemical messengers such as the hormones, the carbon
dioxide content of the blood, etc.); and, except in the case of the heart
muscle, these too are generally slower modes of transmission than
myelinated nerve fibers.

Any complete textbook on cybernetics should contain a thorough 
detailed discussion of homeostatic processes, many individual cases 
of which have been discussed in the literature with some detail. 1 
However, this book is rather an introduction to the subject than a 
compendious treatise, and the theory of homeostatic processes 
involves rather too detailed a knowledge of general physiology to be 
in place here. 

1 Cannon, W., The Wisdom of the Body, W. W. Norton & Company, Inc., New
York, 1932; Henderson, L. J., The Fitness of the Environment, The Macmillan
Company, New York, 1913.



V

Computing machines are essentially machines for recording
numbers, operating with numbers, and giving the result in numerical
form. A very considerable part of their cost, both in money and in
the effort of construction, goes to the simple problem of recording
numbers clearly and accurately. The simplest mode of doing this
seems to be on a uniform scale, with a pointer of some sort moving
over this. If we wish to record a number with an accuracy of one
part in n, we have to assure that in each region of the scale the pointer
assumes the desired position within this accuracy. That is, for an
amount of information $\log_2 n$, we must finish each part of the
movement of the pointer with this degree of accuracy, and the cost
will be of the form An, where A is not too far from a constant.
More precisely, since if n − 1 regions are accurately established,
the remaining region will also be determined accurately, the cost of
recording an amount of information I will be about

$$(2^I - 1)A \tag{5.01}$$

Now let us divide this information over two scales, each marked less
accurately. The cost of recording this information will be about

$$2\,(2^{I/2} - 1)A \tag{5.02}$$

If the information be divided among N scales, the approximate cost
will be

$$N(2^{I/N} - 1)A \tag{5.03}$$

This will be a minimum when

$$2^{I/N} - 1 = \frac{I}{N}\,2^{I/N}\log 2 \tag{5.04}$$

or if we put

$$\frac{I}{N}\log 2 = x \tag{5.05}$$

when

$$x = \frac{e^x - 1}{e^x} = 1 - e^{-x} \tag{5.06}$$

This will occur when and only when x = 0, or N = ∞. That is, N
should be as large as possible to give the lowest cost for the storage
of information. Let us remember that $2^{I/N}$ must be an integer, and
that 1 is not a significant value, as we then have an infinite number
of scales each containing no information. The best significant value
for $2^{I/N}$ is 2, in which case we record our number on a number of
independent scales, each divided into two equal parts. In other
words, we represent our numbers in the binary system on a number
of scales in which all that we know is that a certain quantity lies in
one or the other of two equal portions of the scale, and in which the
probability of an imperfect knowledge as to which half of the scale
contains the observation is made vanishingly small. In other words,
we represent a number v in the form

$$v = v_0 + \frac{1}{2}v_1 + \frac{1}{2^2}v_2 + \cdots + \frac{1}{2^n}v_n + \cdots \tag{5.07}$$

where every $v_n$ is either 1 or 0.
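
The cost argument and the binary representation can both be tried out directly. The sketch below is illustrative only: it evaluates the cost of Eq. 5.03 for an assumed amount of information I = 12 with the constant A set to 1, and it extracts the digits of Eq. 5.07 for a number between 0 and 2. The function names and sample values are hypothetical.

```python
def recording_cost(info, n_scales, a=1.0):
    """Cost N (2^(I/N) - 1) A of recording information I on N scales (Eq. 5.03)."""
    return n_scales * (2.0 ** (info / n_scales) - 1.0) * a


def binary_digits(v, n_digits):
    """Digits v_0, v_1, ... of Eq. 5.07 for 0 <= v < 2, so that v is
    approximated by the sum of v_n / 2^n."""
    digits = []
    for _ in range(n_digits):
        d = int(v)  # integer part is the current digit, 0 or 1
        digits.append(d)
        v = 2.0 * (v - d)  # shift the remainder up by one binary place
    return digits


if __name__ == "__main__":
    info = 12  # assumed amount of information, in binary units
    for n in (1, 2, 3, 4, 6, 12):
        print(n, recording_cost(info, n))
    print(binary_digits(1.625, 4))
```

The cost falls steadily as the same information is spread over more scales, and N = I is the case in which each scale is divided into just two parts.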

There exist at present two great types of computing machines:
those like the Bush differential analyzer,1 which are known as
analogy machines, where the data are represented by measurements
on some continuous scale, so that the accuracy of the machine is
determined by the accuracy of construction of the scale; and those,
like the ordinary desk adding and multiplying machine, which we call
numerical machines, where the data are represented by a set of choices
among a number of contingencies, and the accuracy is determined by
the sharpness with which the contingencies are distinguished, the
number of alternative contingencies presented at every choice, and
the number of choices given. We see that for highly accurate work,
at any rate, the numerical machines are preferable, and above all,
those numerical machines constructed on the binary scale, in which
the number of alternatives presented at each choice is two. Our use
of machines on the decimal scale is conditioned merely by the
historical accident that the scale of ten, based on our fingers and
thumbs, was already in use when the Hindus made the great discovery
of the importance of the zero and the advantage of a positional
system of notation. It is worth retaining when a large part of the
work done with the aid of the machine consists in transcribing onto
the machine numbers in the conventional decimal form, and in taking
off the machine numbers which must be written in the same
conventional form.

1 Journal of the Franklin Institute, various papers, 1930 on.

This is, in fact, the use of the ordinary desk computing machine,
as employed in banks, in business offices, and in many statistical
laboratories. It is not the way that the larger and more automatic
machines are best to be employed; in general, any computing
machine is used because machine methods are faster than hand
methods. In any combined use of means of computation, as in any
combination of chemical reactions, it is the slowest which gives the
order of magnitude of the time constants of the entire system. It is
thus advantageous, as far as possible, to remove the human element
from any elaborate chain of computation and to introduce it only
where it is absolutely unavoidable, at the very beginning and the
very end. Under these conditions, it pays to have an instrument for
the change of the scale of notation, to be used initially and finally in
the chain of computations, and to perform all intermediate processes
on the binary scale.

The ideal computing machine must then have all its data inserted 
at the beginning, and must be as free as possible from human inter- 
ference to the very end. This means that not only must the numeri- 
cal data be inserted at the beginning, but also all the rules for 
combining them, in the form of instructions covering every situation 
which may arise in the course of the computation. Thus the 
computing machine must be a logical machine as well as an arithmetic 
machine and must combine contingencies in accordance with a 
systematic algorithm. While there are many algorithms which 
might be used for combining contingencies, the simplest of these is 
known as the algebra of logic par excellence, or the Boolean algebra. 
This algorithm, like the binary arithmetic, is based on the dichotomy, 
the choice between yes and no, the choice between being in a class 
and outside. The reasons for its superiority to other systems are 
of the same nature as the reasons for the superiority of the binary 
arithmetic over other arithmetics. 

Thus all the data, numerical or logical, put into the machine are 
in the form of a set of choices between two alternatives, and all the 
operations on the data take the form of making a set of new choices 
depend on a set of old choices. When I add two one-digit numbers, 
A and B, I obtain a two-digit number commencing with 1, if A and 
B are both 1, and otherwise with 0. The second digit is 1 if A ≠ B, 
and is otherwise 0. The addition of numbers of more than one digit 
follows similar but more complicated rules. Multiplication in the 
binary system, as in the decimal, may be reduced to the multiplication 
table and the addition of numbers, and the rules for multiplication 
for binary numbers take on the peculiarly simple form given by 
the table 

    ×   0   1
    0   0   0
    1   0   1

Thus multiplication is simply a method to determine a set of new 
digits when old digits are given. 
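
The two one-digit rules just stated can be written down directly. The sketch below is an illustration only, with hypothetical function names; it follows the wording of the text (first digit 1 only when both inputs are 1, second digit 1 only when the inputs differ).

```python
def add_one_digit(a: int, b: int) -> tuple[int, int]:
    """Add two binary digits; return (first_digit, second_digit) of the sum."""
    first = 1 if (a == 1 and b == 1) else 0   # commences with 1 only if both are 1
    second = 1 if a != b else 0               # second digit is 1 only if A differs from B
    return first, second


def multiply_one_digit(a: int, b: int) -> int:
    """Binary multiplication table: the product is 1 only when both digits are 1."""
    return 1 if (a == 1 and b == 1) else 0
```

For every pair of digits, `2 * first + second` equals `a + b`, which is one way to check that the two rules together really are addition.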

On the logical side, if O is a negative and I a positive decision, 
every operator can be derived from three: negation, which transforms 
I into O and O into I; logical addition, with the table 

    ⊕   O   I
    O   O   I        (5.09)
    I   I   I

and logical multiplication, with the same table as the numerical 
multiplication of the (1,0) system, namely, 

    ⊙   O   I
    O   O   O
    I   O   I


That is, every contingency which may arise in the operation of the 
machine simply demands a new set of choices of contingencies I and 
O, depending according to a fixed set of rules on the decisions already 
made. In other words, the structure of the machine is that of a 
bank of relays, capable each of two conditions, say “on” and “off”; 
while at each stage the relays assume each a position dictated by the 
positions of some or all the relays of the bank at a previous stage of 
operation. These stages of operation may be definitely “clocked” 
from some central clock or clocks, or the action of each relay may be 
held up until all the relays which should have acted earlier in the 
process have gone through all the steps called for. 
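
A small sketch may make the picture of a clocked bank of relays concrete. It is an assumption-laden illustration, not a description of any actual machine: relays are modelled as the integers 0 ("off") and 1 ("on"), the rule for each relay is a Python function of the previous positions of the bank, and the two-relay counter is a hypothetical example. The exclusive "or" is built from the three operators named above to show one derived operator.

```python
def negation(x: int) -> int:
    return 1 - x

def logical_add(x: int, y: int) -> int:
    return 1 if (x == 1 or y == 1) else 0

def logical_mul(x: int, y: int) -> int:
    return 1 if (x == 1 and y == 1) else 0

def exclusive_or(x: int, y: int) -> int:
    # Derived operator: (x and not y) or (not x and y).
    return logical_add(logical_mul(x, negation(y)), logical_mul(negation(x), y))

def clock_tick(bank, rules):
    """One clocked stage: each relay takes the position dictated by the previous stage."""
    return [rule(bank) for rule in rules]

# Hypothetical two-relay bank that counts 0, 1, 2, 3, 0, ... (relay 0 is the low digit).
COUNTER_RULES = [
    lambda bank: negation(bank[0]),
    lambda bank: exclusive_or(bank[1], bank[0]),
]

def run(bank, rules, ticks):
    """Return the successive positions of the bank, starting with the initial one."""
    history = [list(bank)]
    for _ in range(ticks):
        bank = clock_tick(bank, rules)
        history.append(bank)
    return history
```

Every new position is computed from the old bank as a whole before any relay changes, which is the role the central clock plays in the text.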

The relays used in a computing machine may be of very varied 
character. They may be purely mechanical, or they may be 
electro-mechanical, as in the case of a solenoidal relay, in which the 
armature will remain in one of two possible positions of equilibrium 
until an appropriate impulse pulls it to the other side. They may 
be purely electrical systems with two alternative positions of 
equilibrium, either in the form of gas-filled tubes, or, what is much more 
rapid, in the form of high-vacuum tubes. The two possible states of 
a relay system may both be stable in the absence of outside 
interference, or only one may be stable, while the other is transitory. 
Always in the second case and generally in the first case, it will be 
desirable to have special apparatus to retain an impulse which is to 
act at some future time, and to avoid the clogging up of the system 
which will ensue if one of the relays does nothing but repeat itself 
indefinitely. However, we shall have more to say concerning this 
question of memory later. 

It is a noteworthy fact that the human and animal nervous 
systems, which are known to be capable of the work of a computation 
system, contain elements which are ideally suited to act as relays. 
These elements are the so-called neurons or nerve cells. While they 
show rather complicated properties under the influence of electrical 
currents, in their ordinary physiological action they conform very 
nearly to the “all-or-none” principle; that is, they are either at rest, 
or when they “fire” they go through a series of changes almost 
independent of the nature and intensity of the stimulus. There is 
first an active phase, transmitted from one end to the other of the 
neuron with a definite velocity, to which there succeeds a refractory 
period during which the neuron is either incapable of being stimu- 
lated, or at any rate is not capable of being stimulated by any normal, 
physiological process. At the end of this effective refractory period, 
the nerve remains inactive, but may be stimulated again into 
activity. 

Thus the nerve may be taken to be a relay with essentially two 
states of activity: firing and repose. Leaving aside those neurons 
which accept their messages from free endings or sensory end organs, 
each neuron has its message fed into it by other neurons at points 
of contact known as synapses. For a given outgoing neuron, these 
vary in number from a very few to many hundred. It is the state of 
the incoming impulses at the various synapses, combined with the 
antecedent state of the outgoing neuron itself, which determines 
whether it will fire or not. If it is neither firing nor refractory, and 
the number of incoming synapses which “fire” within a certain 
very short fusion interval of time exceeds a certain threshold, 
then the neuron will fire after a known, fairly constant synaptic 
delay. 
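
The firing rule of the last two paragraphs can be put in a few lines. The sketch below is a deliberately crude model under stated assumptions: time runs in discrete steps, one step stands in for both the fusion interval and the synaptic delay, and the class and parameter names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Neuron:
    threshold: int            # firing requires MORE active synapses than this
    refractory_steps: int     # steps after firing during which stimulation is ignored
    refractory_left: int = 0  # internal state: remaining refractory steps

    def step(self, incoming: list[int]) -> bool:
        """Advance one time step; return True if the neuron fires on this step."""
        if self.refractory_left > 0:
            # Refractory period: the neuron cannot be stimulated.
            self.refractory_left -= 1
            return False
        if sum(incoming) > self.threshold:
            # Enough synapses fired together: all-or-none response.
            self.refractory_left = self.refractory_steps
            return True
        return False
```

With `Neuron(threshold=1, refractory_steps=2)`, the inputs `[1, 1, 0]`, `[1, 1, 1]`, `[1, 1, 1]`, `[1, 0, 0]`, `[1, 1, 0]` produce firing on the first and last steps only: the second and third fall in the refractory period, and the fourth does not exceed the threshold.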

This is perhaps an oversimplification of the picture: the “threshold” 
may not depend simply on the number of synapses but on their 
“weight” and their geometrical relations to one another with respect 
to the neuron into which they feed; and there is very convincing 
evidence that there exist synapses of a different nature, the so-called 
“inhibitory synapses,” which either completely prevent the firing of 
the outgoing neuron or at any rate raise its threshold with respect to 
stimulation at the ordinary synapses. What is pretty clear, however, 
is that some definite combinations of impulses on the incoming 
neurons having synaptic connections with a given neuron will cause 
it to fire, while others will not cause it to fire. This is not to say that 
there may not be other, non-neuronic influences, perhaps of a humoral 
nature, which produce slow, secular changes tending to vary that 
pattern of incoming impulses which is adequate for firing. 

A very important function of the nervous system, and, as we have 
said, a function equally in demand for computing machines, is that 
of memory, the ability to preserve the results of past operations for 
use in the future. It will be seen that the uses of the memory are 
highly various, and it is improbable that any single mechanism can 
satisfy the demands of all of them. There is first the memory which 
is necessary for the carrying out of a current process, such as a 
multiplication, in which the intermediate results are of no value 
when once the process is completed, and in which the operating 
apparatus should then be released for further use. Such a memory 
should record quickly, be read quickly, and be erased quickly. On 
the other hand, there is the memory which is intended to be part of 
the files, the permanent record, of the machine or the brain, and to 
contribute to the basis of all its future behavior, at least during a 
single run of the machine. Let it be remarked parenthetically that 
an important difference between the way in which we use the brain 
and the machine is that the machine is intended for many successive 
runs, either with no reference to each other, or with a minimal, 
limited reference, and that it can be cleared between such runs; 
while the brain, in the course of nature, never even approximately 
clears out its past records. Thus the brain, under normal circum- 
stances, is not the complete analogue of the computing machine but 
rather the analogue of a single run on such a machine. We shall see 
later that this remark has a deep significance in psychopathology and 
in psychiatry. 

To return to the problem of memory, a very satisfactory method 
for constructing a short-time memory is to keep a sequence of 
impulses traveling around a closed circuit until this circuit is cleared by 
intervention from outside. There is much reason to believe that 
this happens in our brains during the retention of impulses, which 
occurs over what is known as the specious present. This method 
has been imitated in several devices which have been used in 
computing machines, or at least suggested for such a use. There are two 
conditions which are desirable in such a retentive apparatus: the 
impulse should be transmitted in a medium in which it is not too 
difficult to achieve a considerable time lag; and before the errors 
inherent in the instrument have blurred it too much, the impulse 
should be reconstructed in a form as sharp as possible. The first 
condition tends to rule out delays produced by the transmission of 
light, or even, in many cases, by electric circuits, while it favors the 
use of one form or another of elastic vibrations; and such vibrations 
have actually been employed for this purpose in computing machines. 
If electric circuits are used for delay purposes, the delay produced at 
every stage is relatively short; or, as in all pieces of linear apparatus, 
the deformation of the message is cumulative and very soon becomes 
intolerable. To avoid this, a second consideration comes into play; 
we must insert somewhere in the cycle a relay which does not serve 
to repeat the form of the incoming message but rather to trigger off a 
new message of prescribed form. This is done very easily in the 
nervous system, where indeed all transmission is more or less of a 
trigger phenomenon. In the electrical industry, pieces of apparatus 
for this purpose have long been known and have been used in 
connection with telegraph circuits. They are known as telegraph- 
type repeaters. The great difficulty of using them for memories of 
long duration is that they have to function without a flaw over an 
enormous number of consecutive cycles of operation. Their success 
is all the more remarkable: in a piece of apparatus designed by Mr. 
Williams of the University of Manchester, a device of this sort with a 
unit delay of the order of a hundredth of a second has continued in 
successful operation for several hours. What makes this more 
remarkable is that this apparatus was not used merely to preserve a 
single decision, a single “yes” or “no,” but a matter of thousands of 
decisions. 
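
The contrast between a purely linear delay circuit and one containing a trigger-type repeater can be illustrated numerically. The sketch below makes several assumptions that are not in the text: the message is a ring of pulse amplitudes, the linear medium smears each pulse into its two neighbours with the made-up weights 0.7, 0.15, 0.15, and the repeater simply emits a fresh pulse of prescribed form wherever the incoming amplitude is above one half.

```python
def circulate(pulses: list[int], cycles: int, regenerate: bool) -> list[float]:
    """Send a message round a closed circuit `cycles` times; return the amplitudes."""
    signal = [float(p) for p in pulses]
    n = len(signal)
    for _ in range(cycles):
        # Linear medium: every pulse is blurred into its neighbours (assumed weights).
        signal = [
            0.7 * signal[i] + 0.15 * signal[i - 1] + 0.15 * signal[(i + 1) % n]
            for i in range(n)
        ]
        if regenerate:
            # Repeater: do not copy the incoming form, trigger a new clean pulse.
            signal = [1.0 if s > 0.5 else 0.0 for s in signal]
    return signal


def read(signal: list[float]) -> list[int]:
    """Interpret the amplitudes as yes/no decisions."""
    return [1 if s > 0.5 else 0 for s in signal]
```

With the message `[1, 0, 1, 1, 0, 0, 0, 0]` and 200 cycles, the regenerated circuit still reads back the original decisions, while the purely linear one has blurred every position toward the same average amplitude and the message is lost. The cumulative deformation is the point; the particular weights carry no significance.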

Like other forms of apparatus intended to retain a large number of 
decisions, this works on the scanning principle. One of the simplest 
modes of storing information for a relatively short time is as the 
charge on a condenser; and when this is supplemented by a telegraph- 
type repeater, it becomes an adequate method of storage. To use 
to the best advantage the circuit facilities attached to such a storage 
system, it is desirable to be able to switch successively and very 
rapidly from one condenser to another. The ordinary means of 
doing this involve mechanical inertia, and this is never consistent 
with very high speeds. A much better way is the use of a large 
number of condensers, in which one plate is either a small piece 
of metal sputtered into a dielectric, or the imperfectly insulating 
surface of the dielectric itself, while one of the connectors to these 
condensers is a pencil of cathode rays moved by the condensers and 
magnets of a sweep circuit over a course like that of a plough in a 
ploughed field. There are various elaborations of this method, 
which indeed was employed in a somewhat different way by the 
Radio Corporation of America before it was used by Mr. Williams. 

These last-named methods for storing information can hold a 
message for quite an appreciable time, if not for a period comparable 
with a human lifetime. For more permanent records, there is a 
wide variety of alternatives among which we can choose. Leaving 
out such bulky, slow, and unerasable methods as the use of punched 
cards and punched tape, we have magnetic tape, together with its 
modern refinements, which have largely eliminated the tendency of 
messages on this material to spread; phosphorescent substances; and 
above all, photography. Photography is indeed ideal for the per- 
manence and detail of its records, ideal again from the point of view 
of the shortness of exposure needed to record an observation. It 
suffers from two grave disadvantages: the time needed for develop- 
ment, which has been reduced to a few seconds, but is still not small 
enough to make photography available for a short-time memory; 
and (at present [1947]) the fact that a photographic record is not 
subject to rapid erasure and the rapid implanting of a new record. 
The Eastman people have been working on just these problems, 
which do not seem to be necessarily insoluble, and it is possible that 
by this time they have found the answer. 

Very many of the methods of storage of information already 
considered have an important physical element in common. They 
seem to depend on systems with a high degree of quantum degeneracy, 
or, in other words, with a large number of modes of vibration of the 
same frequency. This is certainly true in the case of ferromagnetism, 
and is also true in the case of materials with an exceptionally high 
dielectric constant, which are thus especially valuable for use in 
condensers for the storage of information. Phosphorescence as well 
is a phenomenon associated with a high quantum degeneracy, and 
the same sort of effect makes its appearance in the photographic 
process, where many of the substances which act as developers seem 
to have a great deal of internal resonance. Quantum degeneracy 
appears to be associated with the ability to make small causes 
produce appreciable and stable effects. We have already seen in 
Chapter II that substances with high quantum degeneracy appear 
to be associated with many of the problems of metabolism and 
reproduction. It is probably not an accident that here, in a non- 
living environment, we find them associated with a third fundamental 
property of living matter: the ability to receive and organize impulses 
and to make them effective in the outer world. 

We have seen in the case of photography and similar processes that 
it is possible to store a message in the form of a permanent alteration 
of certain storage elements. In reinserting this information into the 
system, it is necessary to cause these changes to affect the messages 
going through the system. One of the simplest ways to do this is to 
have, as the storage elements which are changed, parts which 
normally assist in the transmission of messages, and of such a nature 
that the change in their character due to storage affects the manner in 
which they will transport messages for the entire future. In the 
nervous system, the neurons and the synapses are elements of this 
sort, and it is quite plausible that information is stored over long 
periods by changes in the thresholds of neurons, or, what may be 
regarded as another way of saying the same thing, by changes in the 
permeability of each synapse to messages. Many of us think, in the 
absence of a better explanation of the phenomenon, that the storage 
of information in the brain can actually occur in this way. It is 
conceivable for such a storage to take place either by the opening of 
new paths or by the closure of old ones. Apparently it is adequately 
established that no neurons are formed in the brain after birth. It 
is possible, though not certain, that no new synapses are formed, and 
it is a plausible conjecture that the chief changes of thresholds in the 
memory process are increases. If this is the case, our whole life is 
on the pattern of Balzac’s Peau de Chagrin, and the very process of 
learning and remembering exhausts our powers of learning and 
remembering until life itself squanders our capital stock of power to 
live. It may well be that this phenomenon does occur. This is a 
possible explanation for a sort of senescence. The real phenomenon 
of senescence, however, is much too complicated to be explained in 
this way alone. 

We have already spoken of the computing machine, and 
consequently the brain, as a logical machine. It is by no means trivial 
to consider the light cast on logic by such machines, both natural and 
artificial. Here the chief work is that of Turing. 1 We have said 
before that the machina ratiocinatrix is nothing but the calculus 
ratiocinator of Leibniz with an engine in it; and just as modern 
mathematical logic begins with this calculus, so it is inevitable that 
its present engineering development should cast a new light on logic. 
The science of today is operational; that is, it considers every 
statement as essentially concerned with possible experiments or observable 
processes. According to this, the study of logic must reduce to the 
study of the logical machine, whether nervous or mechanical, with 
all its non-removable limitations and imperfections. 

It may be said by some readers that this reduces logic to 
psychology, and that the two sciences are observably and demonstrably 
different. This is true in the sense that many psychological states 
and sequences of thought do not conform to the canons of logic. 
Psychology contains much that is foreign to logic, but—and this is 
the important fact—any logic which means anything to us can 
contain nothing which the human mind—and hence the human 
nervous system—is unable to encompass. All logic is limited by the 
limitations of the human mind when it is engaged in that activity known 
as logical thinking. 

For example, we devote much of mathematics to discussions 
involving the infinite, but these discussions and their accompanying 
proofs are not infinite in fact. No admissible proof involves more 
than a finite number of stages. It is true, a proof by mathematical 
induction seems to involve an infinity of stages, but this is only 
apparent. In fact, it involves just the following stages: 

1. $P_n$ is a proposition involving the number $n$. 

2. $P_n$ has been proved for $n = 1$. 

3. If $P_n$ is true, $P_{n+1}$ is true. 

4. Therefore, $P_n$ is true for every positive integer $n$. 

It is true that somewhere in our logical assumptions there must be 
one which validates this argument. However, this mathematical 
induction is a far different thing from complete induction over an 
infinite set. The same thing is true of the more refined forms of 
mathematical induction, such as transfinite induction, which occur in 
certain mathematical disciplines. 
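
The assumption which validates the four stages can itself be written as a single finite formula. The display below is the usual modern statement of the induction principle, given only as an aid to the reader; the quantifier notation is not the book's own.

```latex
\[
\bigl( P_1 \;\wedge\; \forall n \,( P_n \Rightarrow P_{n+1} ) \bigr)
\;\Longrightarrow\; \forall n \; P_n ,
\qquad n = 1, 2, 3, \dots
\]
```

Stage 2 supplies $P_1$, stage 3 supplies the implication, and stage 4 is the conclusion; the proof invokes the formula once and never runs through the infinitely many cases.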

Thus some very interesting situations arise, in which we may be 
able—with enough time and enough computational aids—to prove 

1 Turing, A. M., “On Computable Numbers with an Application to the 
Entscheidungsproblem,” Proceedings of the London Mathematical Society, 
Ser. 2, 42, 230-266 (1936). 
every single case of a theorem $P_n$; but if there is no systematic way 
of subsuming these proofs under a single argument independent of $n$, 
such as we find in mathematical induction, it may be impossible to 
prove $P_n$ for all $n$. This contingency is recognized in what is known 
as metamathematics, the discipline so brilliantly developed by 
Gödel and his school. 

A proof represents a logical process which has come to a definitive 
conclusion in a finite number of stages. However, a logical machine 
following definite rules need never come to a conclusion. It may go 
on grinding through different stages without ever coming to a stop, 
either by describing a pattern of activity of continually increasing 
complexity, or by going into a repetitive process like the end of a 
chess game in which there is a continuing cycle of perpetual check. 
This occurs in the case of some of the paradoxes of Cantor and Russell. 
Let us consider the class of all classes which are not members of 
themselves. Is this class a member of itself? If it is, it is certainly not 
a member of itself; and if it is not, it is equally certainly a member 
of itself. A machine to answer this question would give the 
successive temporary answers: “yes,” “no,” “yes,” “no,” and so on, 
and would never come to equilibrium. 
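
The behavior of such a machine is easy to imitate. The sketch below is a toy under obvious assumptions: the machine holds one temporary answer, and the only rule it knows is the one in the text, that each answer implies its opposite. The function name and the choice of first answer are hypothetical.

```python
def paradox_machine(stages: int, first_answer: bool = True) -> list[str]:
    """Return the successive temporary answers to 'is this class a member of itself?'"""
    answers = []
    answer = first_answer
    for _ in range(stages):
        answers.append("yes" if answer else "no")
        # If it is a member, it is not; if it is not, it is.
        answer = not answer
    return answers
```

`paradox_machine(4)` returns `["yes", "no", "yes", "no"]`, and no number of further stages produces two equal consecutive answers. The position of each answer in the list plays the part of the time at which it is asserted.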

Bertrand Russell’s solution of his own paradoxes was to affix to 
every statement a quantity, the so-called type, which serves to 
distinguish between what seems to be formally the same statement, 
according to the character of the objects with which it concerns itself 
—whether these are “things,” in the simplest sense, classes of 
“things,” classes of classes of “things,” etc. The method by which 
we resolve the paradoxes is also to attach a parameter to each 
statement, this parameter being the time at which it is asserted. In both 
cases, we introduce what we may call a parameter of uniformization, 
to resolve an ambiguity which is simply due to its neglect. 

We thus see that the logic of the machine resembles human logic, 
and, following Turing, we may employ it to throw light on human 
logic. Has the machine a more eminently human characteristic as 
well—the ability to learn? To see that it may well have even this 
property, let us consider two closely related notions: that of the 
association of ideas and that of the conditioned reflex. 

In the British empirical school of philosophy, from Locke to Hume, 
the content of the mind was considered to be made up of certain 
entities known to Locke as ideas, and to the later authors as ideas 
and impressions. The simple ideas or impressions were supposed to 
exist in a purely passive mind, as free from influence on the ideas it 
contained as a clean blackboard is on the symbols which may be 
written on it. By some sort of inner activity, hardly worthy to be 
called a force, these ideas were supposed to unite themselves into 
bundles, according to the principles of similarity, contiguity, and 
cause and effect. Of these principles, perhaps the most significant 
was contiguity: ideas or impressions which had often occurred 
together in time or in space were supposed to have acquired the 
ability of evoking one another, so that the presence of any one of them 
would produce the entire bundle. 

In all this there is a dynamics implied, but the idea of a dynamics 
had not yet filtered through from physics to the biological and 
psychological sciences. The typical biologist of the eighteenth 
century was Linnaeus, the collector and classifier, with a point of 
view quite opposed to that of the evolutionists, the physiologists, 
the geneticists, the experimental embryologists of the present day. 
Indeed, with so much of the world to explore, the state of mind of 
the biologists could hardly have been different. Similarly, in psy- 
chology, the notion of mental content dominated that of mental 
process. This may well have been a survival of the scholastic 
emphasis on substances, in a world in which the noun was hypos- 
tasized and the verb carried little or no weight. Nevertheless, the 
step from these static ideas to the more dynamic point of view of the 
present day, as exemplified in the work of Pavlov, is perfectly clear. 

Pavlov worked much more with animals than with men, and he 
reported visible actions rather than introspective states of mind. 
He found in dogs that the presence of food causes the increased 
secretion of saliva and of gastric juice. If then a certain visual 
object is shown to dogs in the presence of food and only in the presence 
of food, the sight of this object in the absence of food will acquire the 
property of being by itself able to stimulate the flow of saliva or of 
gastric juice. The union by contiguity which Locke had observed 
introspectively in the case of ideas now becomes a similar union of 
patterns of behavior. 

There is one important difference, however, between the point of 
view of Pavlov and that of Locke, and it is precisely due to this fact 
that Locke considers ideas and Pavlov patterns of action. The 
responses observed by Pavlov tend to carry a process to a successful 
conclusion or to avoid a catastrophe. Salivation is important for 
deglutition and for digestion, while the avoidance of what we should 
consider a painful stimulus tends to protect the animal from bodily 
injury. Thus there enters into the conditioned reflex something 
that we may call affective tone. We need not associate this with our 
own sensations of pleasure and pain, nor need we in the abstract 
associate it with the advantage of the animal. The essential thing 
is this: that affective tone is arranged on some sort of scale from 
negative “pain” to positive “pleasure”; that for a considerable time, 
or permanently, an increase in affective tone favors all processes in 
the nervous system that are under way at the time and gives them a 
secondary power to increase affective tone; and that a decrease in 
affective tone tends to inhibit all processes under way at the time 
and gives them a secondary ability to decrease affective tone. 

Biologically speaking, of course, a greater affective tone must 
occur predominantly in situations favorable for the perpetuation of 
the race, if not the individual, and a smaller affective tone in 
situations which are unfavorable for this perpetuation, if not disastrous. 
Any race not conforming to this requirement will go the way of 
Lewis Carroll’s Bread-and-Butter Fly, and always die. Nevertheless, 
even a doomed race may show a mechanism valid so long as the race 
lasts. In other words, even the most suicidal apportioning of 
affective tone will produce a definite pattern of conduct. 

Note that the mechanism of affective tone is itself a feedback 
mechanism. It may even be given a diagram such as shown in 
Fig. 7. 



[Fig. 7]


Here the totalizer for affective tone combines the affective tones 
given by the separate affective-tone mechanisms over a short 
interval in the past, according to some rule which we need not specify 
now. The leads back to the individual affective-tone mechanisms 
serve to modify the intrinsic affective tone of each process in the 
direction of the output of the totalizer, and this modification stands 
until it is modified by later messages from the totalizer. The leads 
back from the totalizer to the process mechanisms serve to lower 
thresholds if the total affective tone is increasing, and to raise them 
if the total affective tone is decreasing. They likewise have a 
long-time effect, which endures until it is modified by another impulse from 
the totalizer. This lasting effect, however, is confined to those 
processes actually in being at the time the return message arrives, 
and a similar limitation also applies to the effects on the individual 
affective-tone mechanisms. 

I wish to emphasize that I do not say that the process of the 
conditioned reflex operates according to the mechanism I have 
given; I merely say that it could so operate. If, however, we 
assume this or any similar mechanism, there are a good many 
things we can say concerning it. One is that this mechanism is 
capable of learning. It has already been recognized that the 
conditioned reflex is a learning mechanism, and this idea has been used in 
the behaviorist studies of the learning of rats in a maze. All that is 
needed is that the inducements or punishments used have, 
respectively, a positive and a negative affective tone. This is certainly 
the case, and the experimenter learns the nature of this affective 
tone by experience, not simply by a priori considerations. 

Another point of considerable interest is that such a mechanism 
involves a certain set of messages which go out generally into the 
nervous system, to all elements which are in a state to receive them. 
These are the return messages from the affective-tone totalizer, and 
to a certain extent the messages from the affective-tone mechanisms 
to the totalizers. Indeed, the totalizer need not be a separate 
element but may merely represent some natural combinatory effect 
of messages arriving from the individual affective-tone mechanisms. 
Now, such messages “to whom it may concern” may well be sent out 
most efficiently, with a smallest cost in apparatus, by channels other 
than nervous. In a similar manner, the ordinary communication 
system of a mine may consist of a telephone central with the attached 
wiring and pieces of apparatus. When we want to empty a mine in 
a hurry, we do not trust to this, but break a tube of a mercaptan in 
the air intake. Chemical messengers like this, or like the hormones, 
are the simplest and most effective for a message not addressed to a 
specific recipient. For the moment, let me break into what I know 
to be pure fancy. The high emotional and consequently affective 
content of hormonal activity is most suggestive. This does not 
mean that a purely nervous mechanism is not capable of affective 
tone and of learning, but it does mean that in the study of this aspect 
of our mental activity, we cannot afford to be blind to the 
possibilities of hormonal transmission. It may be excessively fanciful to 
attach this notion to the fact that in the theories of Freud the 
memory—the storage function of the nervous system—and the 
activities of sex are both involved. Sex, on the one hand, and all 
affective content, on the other, contain a very strong hormonal 
element. This suggestion of the importance of sex and hormones has 
been made to me by Dr. J. Lettvin and Mr. Oliver Selfridge. While 
at present there is no adequate evidence to prove its validity, it is 
not manifestly absurd in principle. 

There is nothing in the nature of the computing machine which 
forbids it to show conditioned reflexes. Let us remember that a 
computing machine in action is more than the concatenation of 
relays and storage mechanisms which the designer has built into it. 
It also contains the content of its storage mechanisms, and this 
content is never completely cleared in the course of a single run. 
We have already seen that it is the run rather than the entire existence 
of the mechanical structure of the computing machine which 
corresponds to the life of the individual. We have also seen that in 
the nervous computing machine it is highly probable that informa- 
tion is stored largely as changes in the permeability of the synapses, 
and it is perfectly possible to construct artificial machines where 
information is stored in that way. It is perfectly possible, for ex- 
ample, to cause any message going into storage to change in a 
permanent or semi-permanent way the grid bias of one or of a 
number of vacuum tubes, and thus to alter the numerical value of 
the summation of impulses which will make the tube or tubes fire. 

A more detailed account of learning apparatus in computing and 
control machines, and the uses to which it may be put, may well be 
left to the engineer rather than to a preliminary book like this one.
It is perhaps better to devote the rest of this chapter to the more 
developed, normal uses of modern computing machines. One of the 
chief of these is in the solution of partial differential equations. Even 
linear partial differential equations require the recording of an enor- 
mous mass of data to set them up, as the data involve the accurate 
description of functions of two or more variables. With equations 
of the hyperbolic type, like the wave equation, the typical problem
is that of solving the equation when the initial data are given, and 
this can be done in a progressive manner from the initial data to the 
results at any given later time. This is largely true of equations of 
the parabolic type as well. When it comes to equations of the elliptic 
type, where the natural data are boundary values rather than
initial values, the natural methods of solution involve an iterative
process of successive approximation. This process is repeated a
very large number of times, so that very fast methods, such as those
of the modern computing machine, are almost indispensable.
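
The iterative process for the elliptic case may be sketched as follows. This is a minimal illustration in Python, assuming Laplace's equation on a small square grid and the simplest relaxation scheme (Jacobi iteration); the function name `relax_laplace`, the grid size, and the boundary data are hypothetical choices, and no claim is made that any particular machine proceeds in exactly this way.

```python
def relax_laplace(grid, sweeps):
    """Jacobi relaxation: replace every interior value by the mean of its four neighbours."""
    rows, cols = len(grid), len(grid[0])
    for _ in range(sweeps):
        new = [row[:] for row in grid]           # boundary values are copied unchanged
        for i in range(1, rows - 1):
            for j in range(1, cols - 1):
                new[i][j] = 0.25 * (grid[i - 1][j] + grid[i + 1][j]
                                    + grid[i][j - 1] + grid[i][j + 1])
        grid = new
    return grid


# Hypothetical boundary data: the top edge is held at 1, the other edges at 0.
n = 5
start = [[1.0] * n] + [[0.0] * n for _ in range(n - 1)]
solution = relax_laplace(start, sweeps=200)
centre = solution[n // 2][n // 2]
```

By symmetry the value at the centre of this grid should settle at one quarter. Even so small a grid needs many sweeps to settle, which suggests why speed matters for grids of practical size.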

In non-linear partial differential equations, we miss what we have 
in the case of the linear equations—a reasonably adequate, purely 
mathematical theory. Here computational methods are not only 
important for the handling of particular numerical cases, but, as 
von Neumann has pointed out, we need them in order to form that 
acquaintance with a large number of particular cases without which 
we can scarcely formulate a general theory. To some extent this
has been done with the aid of very expensive experimental apparatus, 
such as wind tunnels. It is in this way that we have become 
acquainted with the more complicated properties of shock waves, 
slip surfaces, turbulence, and the like, for which we are scarcely in a 
position to give an adequate mathematical theory. How many 
undiscovered phenomena of similar nature there may be, we do not 
know. The analogue machines are so much less accurate, and in 
many cases so much slower, than the digital machines that the latter 
give us much more promise for the future. 

It is already becoming clear in the use of these new machines that 
they demand purely mathematical techniques of their own, quite 
different from those in use in manual computation or in the use of 
machines of smaller capacity. For example, even the use of 
machines for computing determinants of moderately high order or 
for the simultaneous solution of twenty or thirty simultaneous 
linear equations shows difficulties which do not arise when we study 
analogous problems of small order. Unless care is exercised in
setting up a problem, these may completely deprive the solution of
any significant figures whatever. It is a commonplace to say that
fine, effective tools like the ultra-rapid computing machine are out 
of place in the hands of those not possessing a sufficient degree of 
technical skill to take full advantage of them. The ultra-rapid 
computing machine will certainly not decrease the need for mathe- 
maticians with a high level of understanding and technical training. 
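
The loss of significant figures in systems of moderate order can be seen in a small experiment. The sketch below is in Python and rests on an assumption of my own choosing: the Hilbert matrix, with entries 1/(i + j + 1), is taken as a standard example of a badly conditioned system. The function names `solve` and `worst_error` are hypothetical.

```python
from fractions import Fraction


def solve(matrix, rhs):
    """Gaussian elimination with partial pivoting; works for floats or Fractions."""
    n = len(rhs)
    a = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= factor * a[col][c]
    x = [0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(a[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (a[r][n] - tail) / a[r][r]
    return x


def worst_error(n):
    """Solve H x = b in floating point, where the exact answer is x = (1, ..., 1)."""
    exact_h = [[Fraction(1, i + j + 1) for j in range(n)] for i in range(n)]
    exact_b = [sum(row) for row in exact_h]
    h = [[float(v) for v in row] for row in exact_h]
    b = [float(v) for v in exact_b]
    return max(abs(xi - 1.0) for xi in solve(h, b))


small, large = worst_error(3), worst_error(10)
```

Comparing `small` with `large` shows how the same procedure that is harmless at order three loses many figures at order ten. Passing `Fraction` entries to `solve` instead gives exact answers at the cost of speed.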

In the mechanical or electrical construction of computing machines,
there are a few maxims which deserve consideration. One is that 
mechanisms which are relatively frequently used, such as multiplying 
or adding mechanisms, should be in the form of relatively standard- 
ized assemblages adapted for one particular use and no other, while 
those of more occasional use should be assembled for the moment of 
use out of elements also available for other purposes. Closely related 
to this consideration is the one that in these more general mechanisms 
the component parts should be available in accordance with their
general properties, and should not be allotted permanently to a
specific association with other pieces of apparatus. There should be
some part of the apparatus, like an automatic telephone-switching
exchange, which will search for free components and connectors of the
various sorts and allot them as they are needed. This will eliminate
much of the very large expense which is due to having a great 
number of unused elements which cannot be used unless their entire 
large assembly is used. We shall find this principle is very important
when we come to consider traffic problems and overloading in the 
nervous system. 
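
The allotting of free elements on demand may be sketched in a few lines. This is only an illustration in Python under assumed names: `ComponentPool`, `allot`, `release`, and the kinds and counts of elements are all hypothetical, and a real exchange would have to deal with timing and contention which are ignored here.

```python
class ComponentPool:
    """Hypothetical exchange that allots free general-purpose elements on demand."""

    def __init__(self, counts):
        self.free = dict(counts)          # kind -> number of idle elements

    def allot(self, needs):
        """Reserve the requested elements, or reserve nothing if any kind is short."""
        if any(self.free.get(kind, 0) < n for kind, n in needs.items()):
            return False
        for kind, n in needs.items():
            self.free[kind] -= n
        return True

    def release(self, needs):
        """Return elements to the pool once the temporary assembly is dissolved."""
        for kind, n in needs.items():
            self.free[kind] = self.free.get(kind, 0) + n


pool = ComponentPool({"relay": 4, "connector": 6})
job_a = {"relay": 3, "connector": 2}
job_b = {"relay": 2, "connector": 2}
first = pool.allot(job_a)     # succeeds: enough idle relays and connectors
second = pool.allot(job_b)    # fails: only one relay is still free
pool.release(job_a)
third = pool.allot(job_b)     # succeeds after the first assembly is dissolved
```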

As a final remark, let me point out that a large computing machine, 
whether in the form of mechanical or electric apparatus or in the
form of the brain itself, uses up a considerable amount of power, all
of which is wasted and dissipated in heat. The blood leaving the
brain is a fraction of a degree warmer than that entering it. No 
other computing machine approaches the economy of energy of the 
brain. In a large apparatus like the Eniac or Edvac, the filaments 
of the tubes consume a quantity of energy which may well be 
measured in kilowatts, and unless adequate ventilating and cooling 
apparatus is provided, the system will suffer from what is the 
mechanical equivalent of pyrexia, until the constants of the machine 
are radically changed by the heat, and its performance breaks down. 
Nevertheless, the energy spent per individual operation is almost 
vanishingly small, and does not even begin to form an adequate 
measure of the performance of the apparatus. The mechanical 
brain does not secrete thought “as the liver does bile,” as the earlier 
materialists claimed, nor does it put it out in the form of energy, as
the muscle puts out its activity. Information is information, not 
matter or energy. No materialism which does not admit this can 
survive at the present day.



VI


Gestalt and Universals


Among other things which we have discussed in the previous
chapter is the possibility of assigning a neural mechanism to Locke’s
theory of the association of ideas. According to Locke, this occurs 
according to three principles: the principle of contiguity, the
principle of similarity, and the principle of cause and effect. The
third of these is reduced by Locke, and even more definitively by
Hume, to nothing more than constant concomitance, and so is 
subsumed under the first, that of contiguity. The second, that of 
similarity, deserves a more detailed discussion. 

How do we recognize the identity of the features of a man, whether 
we see him in profile, in three-quarters face, or in full face? How do 
we recognize a circle as a circle, whether it is large or small, near or 
far; whether, in fact, it is in a plane perpendicular to a line from the
eye meeting it in the middle, and is seen as a circle, or has some other 
orientation, and is seen as an ellipse? How do we see faces and 
animals and maps in clouds, or in the blots of a Rorschach test? All
these examples refer to the eye, but similar problems extend to the
other senses, and some of them have to do with intersensory relations.
How do we put into words the call of a bird or the stridulations of an
insect? How do we identify the roundness of a coin by touch? 

For the present, let us confine ourselves to the sense of vision.
One important factor in the comparison of form of different objects is 
certainly the interaction of the eye and the muscles, whether they 
are the muscles within the eyeball, the muscles moving the eyeball, 
the muscles moving the head, or the muscles moving the body as a 
whole. Indeed, some form of this visual-muscular feedback system 
is important as low in the animal kingdom as the flatworms. There
the negative phototropism, the tendency to avoid the light, seems to 
be controlled by the balance of the impulses from the two eyespots. 
This balance is fed back to the muscles of the trunk, turning the
body away from the light, and, in combination with the general
impulse to move forward, carries the animal into the darkest region
accessible. It is interesting to note that a combination of a pair of
photocells with appropriate amplifiers, a Wheatstone bridge for
balancing their outputs, and further amplifiers controlling the input
into the two motors of a twin-screw mechanism would give us a very
adequate negatively phototropic control for a little boat. It would
be difficult or impossible for us to compress this mechanism into the
dimensions that a flatworm can carry; but here we merely have
another exemplification of the fact that must by now be familiar to
the reader, that living mechanisms tend to have a much smaller
space scale than the mechanisms best suited to the techniques of
human artificers, although, on the other hand, the use of electrical
techniques gives the artificial mechanism an enormous advantage in
speed over the living organism.
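
The control of such a boat can be put in a few lines. The following is a sketch in Python under stated assumptions, not a design: the function name `screw_commands`, the `cruise` and `gain` parameters, and the simple linear law are hypothetical, and the difference of the two photocell readings merely stands in for the imbalance of the bridge.

```python
def screw_commands(left_light, right_light, cruise=1.0, gain=0.5):
    """Return (left_screw, right_screw) thrusts that steer away from the brighter side.

    The difference of the two photocell readings plays the part of the bridge
    imbalance; `cruise` is the general impulse to move forward.
    """
    imbalance = left_light - right_light
    # More thrust on the brighter side swings the bow toward the darker side.
    return cruise + gain * imbalance, cruise - gain * imbalance


balanced = screw_commands(0.4, 0.4)        # equal light: both screws at cruise
light_on_left = screw_commands(0.9, 0.1)   # left screw runs faster, boat turns right
```

When the two readings balance, the boat simply moves forward; any imbalance is fed back as a turn away from the light.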

Without going through all the intermediate stages, let us come at
once to the eye-muscle feedbacks in man. Some of these are of 
purely homeostatic nature, as when the pupil opens in the dark and 
closes in the light, thus tending to confine the flow of light into the 
eye between narrower bounds than would otherwise be possible. 
Others concern the fact that the human eye has economically con-
fined its best form and color vision to a relatively small fovea, while
its perception of motion is better on the periphery. When the
peripheral vision has picked up some object conspicuous by brilliancy
or light contrast or color or above all by motion, there is a reflex
feedback to bring it into the fovea. This feedback is accompanied
by a complicated system of interlinked subordinate feedbacks, which
tend to converge the two eyes so that the object attracting attention 
is in the same part of the visual field of each, and to focus the lens 
so that its outlines are as sharp as possible. These actions are 
supplemented by motions of the head and body, by which we bring 
the object into the center of vision if this cannot be done readily by a
motion of the eyes alone, or by which we bring an object outside 
the visual field picked up by some other sense into that field. In the 
case of objects with which we are more familiar in one angular 
orientation than another—writing, human faces, landscapes, and 
the like—there is also a mechanism by which we tend to pull them
into the proper orientation.

All these processes can be summed up in one sentence: we tend to
bring any object that attracts our attention into a standard position
and orientation, so that the visual image which we form of it varies
within as small a range as possible. This does not exhaust the 
processes which are involved in perceiving the form and meaning of 
the object, but it certainly facilitates all later processes tending to 
this end. These later processes occur in the eye and in the visual 
cortex. There is considerable evidence that for a considerable 
number of stages each step in this process diminishes the number of 
neuron channels involved in the transmission of visual information, 
and brings this information one step nearer to the form in which it is 
used and is preserved in the memory. 

The first step in this concentration of visual information occurs in 
the transition between the retina and the optic nerve. It will be 
noted that while in the fovea there is almost a one-one correspond- 
ence between the rods and cones and the fibers of the optic nerve, 
the correspondence on the periphery is such that one optic nerve 
fiber corresponds to ten or more end organs. This is quite under- 
standable, in view of the fact that the chief function of the peripheral 
fibers is not so much vision itself as a pickup for the centering and
focusing-directing mechanism of the eye. 

One of the most remarkable phenomena of vision is our ability to
recognize an outline drawing. Clearly, an outline drawing of, say, 
the face of a man, has very little resemblance to the face itself in 
color, or in the massing of light and shade, yet it may be a most 
recognizable portrait of its subject. The most plausible explanation 
of this is that, somewhere in the visual process, outlines are emphasized 
and some other aspects of an image are minimized in importance. 
The beginning of these processes is in the eye itself. Like all senses, 
the retina is subject to accommodation; that is, the constant main- 
tenance of a stimulus reduces its ability to receive and to transmit 
that stimulus. This is most markedly so for the receptors which
record the interior of a large block of images with constant color and
illumination, for even the slight fluctuations of focus and point of
fixation which are inevitable in vision do not change the character
of the image received. It is quite different on the boundary of two 
contrasting regions. Here these fluctuations produce an alternation 
between one stimulus and another, and this alternation, as we see in 
the phenomenon of after-images, not only does not tend to exhaust 
the visual mechanism by accommodation but even tends to enhance 
its sensitivity. This is true whether the contrast between the two 
adjacent regions is one of light intensity or of color. As a comment
on these facts, let us note that three-quarters of the fibers in the
optic nerve respond only to the flashing “on” of illumination. We
thus find that the eye receives its most intense impression at bound- 
aries, and that every visual image in fact has something of the nature 
of a line drawing. 
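
The way in which small fluctuations of fixation pick out boundaries can be imitated on a one-dimensional image. The sketch below, in Python, assumes a deliberately crude model of accommodation: a receptor is supposed to respond only to the change in its stimulus when fixation shifts by at most `jitter` positions. The function name `boundary_response` and the sample image are hypothetical.

```python
def boundary_response(image, jitter=1):
    """Response of adapted receptors to a 1-D image under small shifts of fixation.

    A receptor that has accommodated to its steady stimulus is assumed to
    respond only to the change produced when fixation shifts by up to `jitter`.
    """
    n = len(image)
    response = []
    for i in range(n):
        seen = [image[min(max(i + d, 0), n - 1)] for d in range(-jitter, jitter + 1)]
        response.append(max(abs(v - image[i]) for v in seen))
    return response


# Hypothetical image: a dark block followed by a bright block.
image = [0, 0, 0, 0, 5, 5, 5, 5]
outline = boundary_response(image)
```

The interior of each block gives no response at all, while the two receptors on either side of the boundary respond fully, so that the output is in effect a line drawing of the image.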

Probably not all of this action is peripheral. In photography, it
is known that certain treatments of a plate increase its contrasts,
and such phenomena, which are non-linear in nature, are certainly not
beyond what the nervous system can do. They are allied to the
phenomena of the telegraph-type repeater, which we have already
mentioned. Like this, they use an impression which has not been 
blurred beyond a certain point to trigger a new impression of a 
standard sharpness. At any rate, they decrease the total unusable 
information carried by an image, and are probably correlated with a 
part of the reduction of the number of transmission fibers found at 
various stages of the visual cortex. 

We have thus designated several actual or possible stages of the 
diagrammatization of our visual impressions. We center our 
images around the focus of attention and reduce them more or less to 
outlines. We have now to compare them with one another, or at 
any rate with a standard impression stored in memory, such as 
“circle” or “square.” This may be done in several ways. We 
have given a rough sketch which indicates how the Lockean principle
of contiguity in association may be mechanized. Let us notice that
the principle of contiguity also covers much of the other Lockean
principle of similarity. The different aspects of the same object
are often to be seen in those processes which bring it to the focus of 
attention, and of other motions which lead us to see it, now at one 
distance and now at another, now from one angle and now from a 
distinct one. This is a general principle, not confined in its applica-
tion to any particular sense and doubtless of much importance in the
comparison of our more complicated experiences. It is nevertheless
probably not the only process which leads to the formation of our
specifically visual general ideas, or, as Locke would call them,
“complex ideas.” The structure of our visual cortex is too highly 
organized, too specific, to lead us to suppose that it operates by what 
is after all a highly generalized mechanism. It leaves us the im-
pression that we are here dealing with a special mechanism which is
not merely a temporary assemblage of general-purpose elements
with interchangeable parts, but a permanent sub-assembly like
the adding and multiplying assemblies of a computing machine.
Under the circumstances, it is worth considering how such a
sub-assembly might possibly work and how we should go about
designing it.

The possible perspective transformations of an object form what
is known as a group, in the sense in which we have already defined
one in Chapter II. This group defines several sub-groups of trans-
formations: the affine group, in which we consider only those trans-
formations which leave the region at infinity untouched; the
homogeneous dilations about a given point, in which one point, the 
directions of the axes, and the equality of scale in all directions are 
preserved; the transformations preserving length; the rotations in 
two or three dimensions about a point; the set of all translations; 
and so on. Among these groups, the ones we have just mentioned 
are continuous; that is, the operations belonging to them are 
determined by the values of a number of continuously varying
parameters in an appropriate space. They thus form multi-
dimensional configurations in n-space, and contain sub-sets of trans-
formations which constitute regions in such a space.

Now, just as a region in the ordinary two-dimensional plane is
covered by the process of scanning known to the television engineer,
by which a nearly uniformly distributed set of sample positions in
that region is taken to represent the whole, so every region in a
group-space, including the whole of such a space, can be represented 
by a process of group scanning. In such a process, which is by no 
means confined to a space of three dimensions, a net of positions in 
the space is traversed in a one-dimensional sequence, and this net of 
positions is so distributed that it comes near to every position in the 
region, in some appropriately defined sense. It will thus contain
positions as near to any we wish as may be desired. If these 
“positions,” or sets of parameters, are actually used to generate the 
appropriate transformations, it means that the results of transforming 
a given figure by these transformations will come as near as we wish 
to any given transformation of the figure by a transformation
operator lying in the region desired. If our scanning is fine enough,
and the region transformed has the maximum dimensionality of the
regions transformed by the group considered, this means that the
transformations actually traversed will give a resulting region
overlapping any transform of the original region by an amount which
is as large a fraction of its area as we wish. 

Let us then start with a fixed comparison region and a region to be
compared with it. If at any stage of the scanning of the group of
transformations the image of the region to be compared under some
one of the transformations scanned coincides more perfectly with the
fixed pattern than a given tolerance allows, this is recorded, and the
two regions are said to be alike. If this happens at no stage of
the scanning process, they are said to be unlike. This process is per-
fectly adapted to mechanization, and serves as a method to identify
the shape of a figure independently of its size or its orientation or of 
whatever transformations may be included in the group-region to be 
scanned. 
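
The comparison by group scanning may be sketched mechanically as follows. This is a minimal illustration in Python under several assumptions of my own: figures are given as a few sample points, the group-region scanned consists only of rotations and dilations about the origin, and the degree of coincidence is measured by the symmetric Hausdorff distance. The names `transform`, `mismatch`, and `alike`, together with the tolerance and the net of scales and angles, are hypothetical.

```python
import math


def transform(points, angle, scale):
    """Rotate by `angle` (radians) and dilate by `scale`, both about the origin."""
    c, s = math.cos(angle), math.sin(angle)
    return [(scale * (c * x - s * y), scale * (s * x + c * y)) for x, y in points]


def mismatch(a, b):
    """Symmetric Hausdorff distance between two finite point sets."""
    def one_way(p, q):
        return max(min(math.dist(u, v) for v in q) for u in p)
    return max(one_way(a, b), one_way(b, a))


def alike(template, figure, tolerance=0.05,
          scales=(0.5, 1.0, 2.0), angle_steps=36):
    """Scan a net of rotations and dilations; report the first match, if any."""
    for scale in scales:
        for k in range(angle_steps):
            angle = 2.0 * math.pi * k / angle_steps
            if mismatch(transform(figure, angle, scale), template) <= tolerance:
                return True
    return False


# Hypothetical figures given as a few sample points.
ell = [(0, 0), (1, 0), (2, 0), (0, 1)]
turned_ell = transform(ell, math.pi / 2, 2.0)      # same shape, rotated and enlarged
bar = [(0, 0), (1, 0), (2, 0), (3, 0)]

same_form = alike(ell, turned_ell)
different_form = alike(ell, bar)
```

The net used here is coarse; a figure whose true transformation falls between the positions of the net by more than the tolerance allows would be reported as unlike, which is why the scanning must be fine enough for the tolerance chosen.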

If this region is not the entire group, it may well be that region A
seems like region B, and that region B seems like region C, while
region A does not seem like region C. This certainly happens in
reality. A figure may not show any particular resemblance to the
same figure inverted, at least in so far as the immediate impression—
one not involving any of the higher processes—is concerned. Never-
theless, at each stage of its inversion, there may be a considerable
range of neighboring positions which appear similar. The universal 
“ideas” thus formed are not perfectly distinct but shade into one 
another. 

There are other more sophisticated means of using group scanning 
to abstract from the transformations of a group. The groups which 
we here consider have a “group measure,” a probability density 
which depends on the transformation group itself and does not 
change when all the transformations of the group are altered by 
being preceded or followed by any specific transformation of the 
group. It is possible to scan the group in such a way that the
density of scanning of any region of a considerable class—that is,
the amount of time which the variable scanning element passes
within the region in any complete scanning of the group—is closely
proportional to its group measure. In the case of such a uniform 
scanning, if we have any quantity depending on a set S of elements 
transformed by the group, and if this set of elements is transformed 
by all the transformations of the group, let us designate the quantity
depending on S by Q(S), and let us use TS to express the transform
of the set S by the transformation T of the group. Then Q(TS) will
be the value of the quantity replacing Q(S) when S is replaced by TS.
If we average or integrate this with respect to the group measure
for the group of transformations T, we shall obtain a quantity which
we may write in some such form as

\[ \int Q(TS)\, dT \tag{6.01} \]

where the integration is over the group measure. Quantity 6.01 
will be identical for all sets S interchangeable with one another under
the transformations of the group, that is, for all sets S which have in
some sense the same form or Gestalt. It is possible to obtain an
approximate comparability of form where the integration in Quantity
6.01 is over less than the whole group, if the integrand Q(TS) is
small over the region omitted. So much for group measure.
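
The invariance of Quantity 6.01 is easily checked in a finite case. In the Python sketch below, the continuous group is replaced, as an assumption made purely for illustration, by the finite group of cyclic shifts of a sequence, so that the integral over the group measure becomes a plain sum over all shifts. The names `shift`, `group_sum`, and `q`, and the particular quantity chosen, are hypothetical.

```python
def shift(seq, k):
    """The transformation T_k: cyclic shift of a sequence by k places."""
    k %= len(seq)
    return seq[k:] + seq[:k]


def group_sum(quantity, seq):
    """Sum Q(TS) over every T in the cyclic group (uniform group measure)."""
    return sum(quantity(shift(seq, k)) for k in range(len(seq)))


def q(seq):
    """A hypothetical quantity that is not itself invariant under shifts."""
    return seq[0] * seq[1]


s = [3, 1, 4, 1, 5]
ts = shift(s, 2)                     # a set interchangeable with s under the group
```

Here `q(s)` and `q(ts)` differ, yet `group_sum(q, s)` and `group_sum(q, ts)` agree, since summing over the whole group runs through the same terms in a different order.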

In recent years, there has been a good deal of attention to the 
problem of the prosthesis of one lost sense by another. The most 
dramatic of the attempts to accomplish this has been the design of
reading devices for the blind, to work by the use of photoelectric
cells. We shall suppose that these efforts are confined to printed
matter, and even to a single type face or to a small number of type
faces. We shall also suppose that the alignment of the page, the
centering of the lines, the traverse from line to line are taken care of
either manually or, as they may well be, automatically. These 
processes correspond, as we may see, to the part of our visual 
Gestalt determination which depends on muscular feedbacks and the 
use of our normal centering, orienting, focusing, and converging 
apparatus. There now ensues the problem of determining the shapes 
of the individual letters as the scanning apparatus passes over 
them in sequence. It has been suggested that this be done by 
the use of several photoelectric cells placed in a vertical sequence,
each attached to a sound-making apparatus of a different pitch. 
This can be done with the black of the letters registering either as 
silence or as sound. Let us assume the latter case, and let us assume 
three photocell receptors above one another. Let them record as the
three notes of a chord, let us say, with the highest note on top and 
the lowest note below. Then the letter capital F, let us say, will 
record

[Diagram: three horizontal bars showing the duration of the upper, middle, and lower notes for the letter F]

The letter capital Z will record

[Diagram not reproduced]

the letter capital O

[Diagram not reproduced]

and so on. With the ordinary help given by our ability to interpret, 
it should not be too difficult to read such an auditory code, not
more difficult than to read Braille, for instance.
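
The auditory code for a letter can be worked out from a coarse picture of it. The following Python sketch assumes a hypothetical three-row rendering of a capital F, one row for each photocell, and a sweep from left to right at constant speed; the function name `note_durations` and the glyph itself are my own illustrations and are not taken from any actual device.

```python
def note_durations(glyph):
    """For each photocell row, count the columns in which the letter is black.

    `glyph` is a list of equal-length strings, top row first, with '#' for ink.
    The count stands for the time the corresponding note sounds during a
    left-to-right sweep at constant speed.
    """
    return [row.count("#") for row in glyph]


# Hypothetical three-row rendering of a capital F.
capital_f = [
    "#####",   # upper photocell: stem and top bar
    "###..",   # middle photocell: stem and shorter middle bar
    "#....",   # lower photocell: stem only
]
durations = note_durations(capital_f)   # upper, middle, lower
```

The sketch gives only the total duration of each note; the order in which the notes start and stop, which also helps to tell letters apart, would need the columns to be read in sequence.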

However, all this depends on one thing: the proper relation of the 
photocells to the vertical height of the letters. Even with standard-
ized type faces, there still are great variations in the size of the type.
Thus it is desirable for us to be able to pull the vertical scale of the
scanning up or down, in order to reduce the impression of a given 
letter to a standard. We must at least have at our disposal, manually 
or automatically, some of the transformations of the vertical dilation 
group. 

There are several ways we might do this. We might allow for a 
mechanical vertical adjustment of our photocells. On the other
hand, we might use a rather large vertical array of photocells and
change the pitch assignment with the size of type, leaving those 
above and below the type silent. This may be done, for example, 
with the aid of a schema of two sets of connectors, the inputs coming 
up from the photocells, and leading to a series of switches of wider
and wider divergence, and the outputs a series of vertical lines, as in
Fig. 8. Here the single lines represent the leads from the photocells,
the double lines the leads to the oscillators, the circles on the dotted 
lines the points of connections between incoming and outgoing leads, 
and the dotted lines themselves the leads whereby one or another of a 
bank of oscillators is put into action. This was the device, to which 
we have referred in the introduction, designed by McCulloch for the 
purpose of adjusting to the height of the type face. In the first 
design, the selection between dotted line and dotted line was manual. 
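
The second scheme, in which the pitch assignment changes with the size of type, can be sketched as a simple mapping. The Python below is an illustration under assumptions of my own: the function name `assign_oscillators`, the numbering of cells from the top, and the even division of the covered cells among the oscillators are hypothetical and do not reproduce the wiring of Fig. 8.

```python
def assign_oscillators(n_cells, top, bottom, n_pitches=3):
    """Map each photocell (0 = uppermost) to an oscillator index or None.

    Cells from `top` to `bottom` inclusive cover the type; they are divided as
    evenly as possible among `n_pitches` oscillators (0 = highest note).
    Cells outside that range stay silent.
    """
    height = bottom - top + 1
    mapping = []
    for cell in range(n_cells):
        if top <= cell <= bottom:
            mapping.append((cell - top) * n_pitches // height)
        else:
            mapping.append(None)
    return mapping


small_type = assign_oscillators(9, top=3, bottom=5)
large_type = assign_oscillators(9, top=0, bottom=8)
```

Each choice of `top` and `bottom` corresponds to one setting of the selector, that is, to one transformation of the vertical dilation group.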

This was the figure which, when shown to Dr. von Bonin, suggested 
the fourth layer of the visual cortex. It was the connecting circles
which suggested the neuron cell bodies of this layer, arranged in 
sub-layers of uniformly changing horizontal density, and size 
changing in the opposite direction to the density. The horizontal 
leads are probably fired in some cyclical order. The whole apparatus
seems quite suited to the process of group scanning. There must
of course be some process of recombination in time of the upper
outputs.

This then was the device suggested by McCulloch as that actually
used in the brain in the detection of visual Gestalt. It represents a
type of device usable for any sort of group scanning. Something
similar occurs in other senses as well. In the ear, the transposition
of music from one fundamental pitch to another is nothing but a
translation of the logarithm of the frequency, and may consequently 
be performed by a group-scanning apparatus. 
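
That transposition is a translation on the logarithmic scale can be verified numerically. The Python sketch below uses hypothetical tones and a hypothetical ratio of 3:2; the function names `log_pitches` and `transpose` are mine.

```python
import math


def log_pitches(frequencies):
    """Position of each tone on the logarithmic frequency axis, in octaves."""
    return [math.log2(f) for f in frequencies]


def transpose(frequencies, ratio):
    """Transposition multiplies every frequency by the same ratio."""
    return [f * ratio for f in frequencies]


melody = [440.0, 550.0, 660.0]           # hypothetical tones, in cycles per second
higher = transpose(melody, 1.5)          # the same tune a fifth higher
shifts = [b - a for a, b in zip(log_pitches(melody), log_pitches(higher))]
```

Every entry of `shifts` comes out equal to the logarithm of the ratio, up to rounding, so that the whole melody has been displaced by one and the same amount.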

A group-scanning assembly thus has well-defined, appropriate 
anatomical structure. The necessary switching may be performed 
by independent horizontal leads which furnish enough stimulation 
to shift the thresholds in each level to just the proper amount to 
make them fire when the lead comes on. While we do not know 
all the details of the performance of the machinery, it is not at 
all difficult to conjecture a possible machine conforming to the
anatomy. In short, the group-scanning assembly is well adapted 
to form the sort of permanent sub-assembly of the brain corre- 
sponding to the adders or multipliers of the numerical computing 
machine. 

Lastly, the scanning apparatus should have a certain intrinsic 
period of operation which should be identifiable in the performance 
of the brain. The order of magnitude of this period should show in 
the minimum time required for making direct comparison of the
shapes of objects different in size. This can be done only when the
comparison is between two objects not too different in size; otherwise,
it is a long-time process, suggestive of the action of a non-specific
assembly. When direct comparison seems to be possible, it appears
to take a time of the order of magnitude of a tenth of a second. This
also seems to accord with the order of magnitude of the time needed
by excitation to stimulate all the layers of transverse connectors in
cyclical sequence.

While this cyclical process then might be a locally determined one,
there is evidence that there is a widespread synchronism in different 
parts of the cortex, suggesting that it is driven from some clocking 
center. In fact, it has the order of frequency appropriate for the 
alpha rhythm of the brain, as shown in electroencephalograms. We 
may suspect that this alpha rhythm is associated with form percep- 
tion, and that it partakes of the nature of a sweep rhythm, like the 
rhythm shown in the scanning process of a television apparatus.
It disappears in deep sleep, and seems to be obscured and overlaid
with other rhythms, precisely as we might expect, when we are
actually looking at something and the sweep rhythm is acting as
something like a carrier for other rhythms and activities. It is
most marked when the eyes are closed in waking, or when we are
staring into space at nothing in particular, as in the condition of
abstraction of a yogi,1 when it shows an almost perfect periodicity.

We have just seen that the problem of sensory prosthesis—the 
problem of replacing the information normally conveyed through a
lost sense by information through another sense still available—is 
important and not necessarily insoluble. What makes it more 
hopeful is the fact that the memory and association areas, normally 
approached through one sense, are not locks with a single key but 
are available to store impressions gathered from other senses than 
the one to which they normally belong. A blinded man, as dis-
tinguished perhaps from one congenitally blind, not only retains
visual memories earlier in date than his accident but is even able 
to store tactile and auditory impressions in a visual form. He may 
feel his way around a room, and yet have an image of how it ought 
to look. 

Thus a part of his normal visual mechanism is accessible to him. 
On the other hand, he has lost more than his eyes: he has also lost 
the use of that part of his visual cortex which may be regarded as a 
fixed assembly for organizing the impressions of sight. It is necessary 
to equip him not only with artificial visual receptors but with an
artificial visual cortex, which will translate the light impressions on
his new receptors into a form so related to the normal output of his
visual cortex that objects which ordinarily look alike will now sound 
alike. 

Thus the criterion of the possibility of such a replacement of sight 
by hearing is at least in part a comparison between the number of 
recognizably different visual patterns and recognizably different 
auditory patterns at the cortical level. This is a comparison of 
amounts of information. In view of the somewhat similar organiza- 
tion of the different parts of the sensory cortex, it will probably not 
differ very much from a comparison between the areas of the two 
parts of the cortex. This is about 100:1 as between sight and sound. 
If all the auditory cortex were used for vision, we might expect to
get a quantity of reception of information about 1 per cent of that 
coming in through the eye. On the other hand, our usual scale for 
the estimation of vision is in terms of the relative distance at which
a certain degree of resolution of pattern is obtained, and thus a
10/100 vision means an amount of flow of information about 1 per
cent of normal. This is very poor vision; it is, however, definitely
not blindness, nor do people with this amount of vision necessarily
consider themselves as blind.

1 Personal communication of Dr. W. Grey Walter, of Bristol, England.
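
The step from a 10/100 vision to 1 per cent of the normal flow of information rests on a piece of arithmetic which may be made explicit. The Python sketch below assumes, as I take the argument to intend, that the information in an image goes as the number of resolvable elements per unit area, and hence as the square of the linear resolution; the function name `information_fraction` is hypothetical.

```python
def information_fraction(acuity):
    """Fraction of normal information flow at a given acuity (1.0 = normal).

    Assumes the information in an image scales with the number of resolvable
    elements per unit area, that is, with the square of linear resolution.
    """
    return acuity ** 2


poor_vision = information_fraction(10 / 100)
```

Under this assumption a linear resolution of one tenth of normal gives one hundredth of the normal information, in agreement with the figure quoted.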

In the other direction, the picture is even more favorable. The 
eye can detect all of the nuances of the ear with the use of only 1 
per cent of its facilities, and still leave a vision of about 95/100,
which is substantially perfect. Thus the problem of sensory pros- 
thesis is an extremely hopeful field of work. 



VII 


Cybernetics and Psychopathology 


It is necessary that I commence this chapter with a disavowal. 
On the one hand, I am not a psychopathologist nor a psychiatrist,
and lack any experience in a field where the guidance of experience
is the only trustworthy one. On the other hand, our knowledge of
the normal performance of the brain and the nervous system, and a
fortiori our knowledge of their abnormal performance, is far from
having reached that state of perfection where an a priori theory can 
command any confidence. I therefore wish to disclaim in advance 
any assertion that any particular entity in psychopathology, as for 
example any of the morbid conditions described by Kraepelin and 
his disciples, is due to a specific type of defect in the organization of
the brain as a computing machine. Those who may draw such 
specific conclusions from the considerations of this book do so at their 
own risk. 

Nevertheless, the realization that the brain and the computing 
machine have much in common may suggest new and valid ap- 
proaches to psychopathology and even to psychiatrics. These 
begin with perhaps the simplest question of all: how the brain 
avoids gross blunders, gross miscarriages of activity, due to the 
malfunction of individual components. Similar questions referring 
to the computing machine are of great practical importance, for 
here a chain of operations, each covering a fraction of a millisecond,
may last a matter of hours or days. It is quite possible for a chain
of computational operations to involve $10^9$ separate steps. Under
these circumstances, the chance that at least one operation will go
amiss is very far from negligible, even though, it is true, the reliability
of modern electronic apparatus has far exceeded the most sanguine 
expectations. 
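
A rough calculation shows why the chance of at least one failure in so long a chain is far from negligible. The sketch below assumes that the steps fail independently and with equal probability; the per-step error rates are chosen purely for illustration and are not figures from the text. The function name is hypothetical.

```python
import math


def prob_at_least_one_error(steps: float, per_step_error: float) -> float:
    """Probability that at least one of `steps` independent operations fails.

    Computes 1 - (1 - e)^N through log1p/expm1, so that very small
    error rates are not lost to floating-point rounding.
    """
    return -math.expm1(steps * math.log1p(-per_step_error))


if __name__ == "__main__":
    steps = 1e9  # the chain length mentioned in the text
    for rate in (1e-12, 1e-10, 1e-9):  # assumed, illustrative error rates
        p = prob_at_least_one_error(steps, rate)
        print(f"per-step error {rate:.0e}: chance of some error = {p:.4f}")
```

Even an error rate of one in a thousand million per step leaves the chain more likely than not to contain a fault.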

In ordinary computational practice by hand or by desk machines, 
it is the custom to check every step of the computation and, when an 
error is found, to localize it by a backward process starting from the
first point where the error is noted. To do this with a high-speed
machine, the check must proceed with the speed of the original
machine, or the whole effective order of speed of the machine will
conform to that of the slower process of checking. Furthermore, if
the machine is made to keep all intermediate records of its
performance, its complication and bulk will be increased to an intolerable
point, by a factor which is likely to be enormously greater than 2 or 3. 

A much better method of checking, and in fact the one generally 
used in practice, is to refer every operation simultaneously to two or 
three separate mechanisms. In the case of the use of two such
mechanisms, their answers are automatically collated against each 
other; and if there is a discrepancy, all data are transferred to per- 
manent storage, the machine stops, and a signal is sent to the 
operator that something is wrong. The operator then compares the 
results, and is guided by them in his search for the malfunctioning 
part, perhaps a tube which has burnt out and needs replacement. If
three separate mechanisms are used for each stage and single
misfunctions are as rare as they are in fact, there will practically always
be an agreement between two of the three mechanisms, and this
agreement will give the required result. In this case, the collation
mechanism accepts the majority report, and the machine need not 
stop; but there is a signal indicating where and how the minority 
report differs from the majority report. If this occurs at the first 
moment of discrepancy, the indication of the position of the error 
may be very precise. In a well-designed machine, no particular 
element is assigned to a particular stage in the sequence of operations, 
but at each stage there is a searching process, quite similar to that 
used in automatic telephone exchanges, which finds the first available 
element of a given sort and switches it into the sequence of operations. 
In this case, the removal and replacement of defective elements
need not be the source of any appreciable delay. 
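
The collation of two or three parallel mechanisms described above can be sketched as follows. This is an illustration only, not a description of any particular machine: the function name and the convention of returning `None` for "stop and call the operator" are assumptions made for the example.

```python
from collections import Counter
from typing import Hashable, List, Optional, Sequence, Tuple


def collate(reports: Sequence[Hashable]) -> Tuple[Optional[Hashable], List[int]]:
    """Collate the answers of parallel mechanisms for one operation.

    Returns (accepted_result, dissenting_indices). When no strict majority
    exists, as with two mechanisms that disagree, accepted_result is None,
    standing in for "store the data, stop, and signal the operator".
    """
    if not reports:
        raise ValueError("at least one report is required")
    value, count = Counter(reports).most_common(1)[0]
    if count * 2 <= len(reports):
        # No majority: every mechanism is under suspicion.
        return None, list(range(len(reports)))
    dissenters = [i for i, report in enumerate(reports) if report != value]
    return value, dissenters


if __name__ == "__main__":
    print(collate([42, 42, 41]))  # three mechanisms, one minority report
    print(collate([5, 7]))        # two mechanisms in disagreement
```

The list of dissenting indices plays the part of the signal indicating where the minority report differs, so that the faulty element can be found without stopping the computation.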

It is conceivable and not implausible that at least two of the 
elements of this process are also represented in the nervous system. 
We can hardly expect that any important message is entrusted for 
transmission to a single neuron, nor that any important operation is
entrusted to a single neuronal mechanism. Like the computing
machine, the brain probably works on a variant of the famous
principle expounded by Lewis Carroll in The Hunting of the Snark:
“What I tell you three times is true.” It is also improbable that the
various channels available for the transfer of information generally 
go from one end of their course to the other without anastomosing. 
It is much more probable that when a message comes in to a certain 
level of the nervous system, it may leave that point and proceed to
the next by one or more alternative members of what is known as
an “internuncial pool.” There may be parts of the nervous system,
indeed, where this interchangeability is much limited or abolished, 
and these are likely to be such highly specialized parts of the cortex 
as those which serve as the inward extensions of the organs of 
special sense. Still, the principle holds, and probably holds most
clearly for the relatively unspecialized cortical areas which serve the
purpose of association and of what we call the higher mental
functions.

So far we have been considering errors in performance which are 
normal, and pathological only in an extended sense. Let us now 
turn to those which are much more clearly pathological. Psycho- 
pathology has been rather a disappointment to the instinctive 
materialism of the doctors, who have taken the point of view that
every disorder must be accompanied by material lesions of some 
specific tissue involved. It is true that specific brain lesions, such 
as injuries, tumors, clots, and the like, may be accompanied by psychic 
symptoms, and that certain mental diseases, such as paresis, are the 
sequelae of general bodily disease and show a pathological condition
of the brain tissue; but there is no way of identifying the brain of a
schizophrenic of one of the strict Kraepelin types, nor of a
manic-depressive patient, nor of a paranoiac. These disorders we call
functional, and this distinction seems to contravene the dogma of
modern materialism that every disorder in function has some
physiological or anatomical basis in the tissues concerned.

This distinction between functional and organic disorders receives
a great deal of light from the consideration of the computing machine.
As we have already seen, it is not the empty physical structure of the
computing machine that corresponds to the brain—to the adult
brain, at least—but the combination of this structure with the 
instructions given it at the beginning of a chain of operations and 
with all the additional information stored and gained from outside 
in the course of this chain. This information is stored in some
physical form—in the form of memory—but part of it is in the form of
circulating memories, with a physical basis which vanishes when the
machine is shut down or the brain dies, and part in the form of
long-time memories, which are stored in a way at which we can only
guess, but probably also in a form with a physical basis which
vanishes at death. There is no way yet known for us to recognize in
the cadaver what the threshold of a given synapse has been in life;
and even if we knew this, there is no way we can trace out the chain
of neurons and synapses communicating with this, and determine the
significance of this chain for the ideational content which it records. 

There is therefore nothing surprising in considering the functional
mental disorders as fundamentally diseases of memory, of the
circulating information kept by the brain in the active state, and of
the long-time permeability of synapses. Even the grosser disorders 
such as paresis may produce a large part of their effects not so much 
by the destruction of tissue which they involve and the alteration of 
synaptic thresholds as by the secondary disturbances of traffic—the
overload of what remains of the nervous system and the re-routing 
of messages—which must follow such primary injuries. 

In a system containing a large number of neurons, circular 
processes can hardly be stable for long periods of time. Either, as in
the case of memories belonging to the specious present, they run 
their course, dissipate themselves, and die out, or they comprehend 
more and more neurons in their system, until they occupy an in- 
ordinate part of the neuron pool. This is what we should expect to 
be the case in the malignant worry which accompanies anxiety 
neuroses. In such a case, it is possible that the patient simply does 
not have the room, the sufficient number of neurons, to carry out his 
normal processes of thought. Under such conditions, there may be 
less going on in the brain to load up the neurons not yet affected, so
that they are all the more readily involved in the expanding process.
Furthermore, the permanent memory becomes more and more deeply
involved, and the pathological process which occurred at first at the 
level of the circulating memories may repeat itself in a more intrac- 
table form at the level of the permanent memories. Thus what 
started as a relatively trivial and accidental reversal of stability may 
build itself up into a process totally destructive to the ordinary 
mental life. 

Pathological processes of a somewhat similar nature are not 
unknown in the case of mechanical or electrical computing machines. 
A tooth of a wheel may slip under just such conditions that no tooth 
with which it engages can pull it back into its normal relations, or a
high-speed electrical computing machine may go into a circular 
process which there seems to be no way to stop. These contingencies 
may depend on a highly improbable instantaneous configuration
of the system, and, when remedied, may never—or very rarely—
repeat themselves. However, when they occur, they temporarily 
put the machine out of action. 

How do we deal with these accidents in the use of the machine? 
The first thing which we try is to clear the machine of all information,
in the hope that when it starts again with different data the difficulty
may not recur. Failing this, if the difficulty is in some point
permanently or temporarily inaccessible to the clearing mechanism,
we shake the machine or, if it is electrical, subject it to an abnormally
large electrical impulse, in the hope that we may reach the inaccessible
part and throw it into a position where the false cycle of its activities
will be interrupted. If even this fails, we may disconnect an erring 
part of the apparatus, for it is possible that what yet remains may be 
adequate for our purpose. 

Now there is no normal process except death which completely
clears the brain from all past impressions; and after death, it is
impossible to set it going again. Of all normal processes, sleep 
comes the nearest to a non-pathological clearing. How often we 
find that the best way to handle a complicated worry or an intellectual 
muddle is to sleep over it! However, sleep does not clear away the
deeper memories, nor indeed is a sufficiently malignant state of
worry compatible with an adequate sleep. We are thus often forced 
to resort to more violent types of intervention in the memory cycle. 
The more violent of these involve a surgical intervention into the 
brain, leaving behind it permanent damage, mutilation, and the 
abridgment of the powers of the victim, as the mammalian central 
nervous system seems to possess no powers whatever of regeneration. 
The principal type of surgical intervention which has been practiced
is known as prefrontal lobotomy, and consists in the removal or 
isolation of a portion of the prefrontal lobe of the cortex. It has 
recently been having a certain vogue, probably not unconnected 
with the fact that it makes the custodial care of many patients 
easier. Let me remark in passing that killing them makes their 
custodial care still easier. However, prefrontal lobotomy does seem 
to have a genuine effect on malignant worry, not by bringing the 
patient nearer to a solution of his problems but by damaging or
destroying the capacity for maintained worry, known in the
terminology of another profession as the conscience. More generally, it
appears to limit all aspects of the circulating memory, the ability to
keep in mind a situation not actually presented. 

The various forms of shock treatment—electric, insulin, metrazol 
—are less drastic methods of doing a very similar thing. They do
not destroy brain tissue or at least are not intended to destroy it,
but they do have a decidedly damaging effect on the memory. In
so far as this concerns the circulating memory, and in so far as this
memory is chiefly damaged for the recent period of mental disorder,
and is probably scarcely worth preserving anyhow, shock treatment 
has something definite to recommend it as against lobotomy; but it 
is not always free from deleterious effects on the permanent memory 
and the personality. As it stands at present, it is another violent,
imperfectly understood, imperfectly controlled method to interrupt 
a mental vicious circle. This does not prevent its being in many 
cases the best thing we can do at present. 

Lobotomy and shock treatment are methods which by their very 
nature are more suited to handle vicious circulating memories and
malignant worries than the deeper-seated permanent memories, 
though it is not impossible that they may have some effect here too. 
As we have said, in long-established cases of mental disorder, the 
permanent memory is as badly deranged as the circulating memory.
We do not seem to possess any purely pharmaceutical or surgical
weapon for intervening differentially in the permanent memory. 
This is where psychoanalysis and other similar psychotherapeutic 
measures come in. Whether psychoanalysis is taken in the orthodox 
Freudian sense or in the modified senses of Jung and of Adler, or 
whether our psychotherapy is not strictly psychoanalytic at all, our 
treatment is clearly based on the concept that the stored information
of the mind lies on many levels of accessibility and is much richer 
and more varied than that which is accessible by direct unaided 
introspection; that it is vitally conditioned by affective experiences 
which we cannot always uncover by such introspection, either 
because they never were made explicit in our adult language, or 
because they have been buried by a definite mechanism, affective 
though generally involuntary; and that the content of these stored
experiences, as well as their affective tone, conditions much of our 
later activity in ways which may well be pathological. The technique 
of the psychoanalyst consists in a series of means to discover and 
interpret these hidden memories, to make the patient accept them 
for what they are and by their acceptance modify, if not their 
content, at least the affective tone they carry, and thus make them
less harmful. All this is perfectly consistent with the point of view 
of this book. It perhaps explains, too, why there are circumstances 
where a joint use of shock treatment and psychotherapy is indicated, 
combining a physical or pharmacological therapy for the phenomena 
of reverberation in the nervous system, and a psychological therapy
for the long-time memories which, without interference, might
re-establish from within the vicious circle broken up by the shock 
treatment. 

We have already mentioned the traffic problem of the nervous 
system. It has been commented on by many writers, such as
D’Arcy Thompson, 1 that each form of organization has an upper
limit of size, beyond which it will not function. Thus the insect
organization is limited by the length of tubing over which the
spiracle method of bringing air by diffusion directly to the breathing
tissues will function; a land animal cannot be so big that the legs or
other portions in contact with the ground will be crushed by its
weight; a tree is limited by the mechanism for transferring water 
and minerals from the roots to the leaves, and the products of 
photosynthesis from the leaves to the roots; and so on. The same 
sort of thing is observed in engineering constructions. Skyscrapers 
are limited in size by the fact that when they exceed a certain height, 
the elevator space needed for the upper stories consumes an excessive 
part of the cross section of the lower floors. Beyond a certain span, 
the best-possible suspension bridge which can be built out of materials
with given elastic properties will collapse under its own weight; and 
beyond a certain greater span, any structure built of a given material 
or materials will collapse under its own weight. Similarly, the size 
of a single telephone central, built according to a constant, non- 
expanding plan, is limited, and this limitation has been very 
thoroughly studied by telephone engineers. 

In a telephone system, the important limiting factor is the fraction 
of the time during which a subscriber will find it impossible to put a 
call through. A 99 per cent chance of success will certainly be
satisfactory for even the most exacting; 90 per cent of successful calls
is probably good enough to permit business to be carried on with
reasonable facility. A success of 75 per cent is already annoying but
will permit business to be carried on after a fashion; while if half the
calls end in failures, subscribers will begin to ask to have their
telephones taken out. Now, these represent over-all figures. If the
calls go through n distinct stages of switching, and probability of
failure is independent and equal for each stage, in order to get a 
probability of total success equal to p, the probability of success at 
each stage must be $p^{1/n}$. Thus to obtain a 75 per cent chance of the
completion of the call after five stages, we must have about 95 per
cent chance of success per stage. To obtain a 90 per cent
performance, we must have 98 per cent chance of success at each stage. To
obtain a 50 per cent performance, we must have 87 per cent chance
of success at each stage. It will be seen that the more stages which
are involved, the more rapidly the service becomes extremely bad
when a critical level of failure for the individual call is exceeded, and
extremely good when this critical level of failure is not quite reached.
Thus a switching service involving many stages and designed for a
certain level of failure shows no obvious signs of failure until the
traffic comes up to the edge of the critical point, when it goes
completely to pieces, and we have a catastrophic traffic jam.

1 Thompson, D’Arcy, On Growth and Form, Amer. ed., The Macmillan Company,
New York, 1942.
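
The per-stage figures quoted above follow from the rule that, with n independent and equally reliable stages, an over-all success probability p requires a per-stage success of p to the power 1/n. The sketch below reproduces that arithmetic and also shows how quickly the over-all figure falls as stages are added; the function names and the 50-stage comparison are illustrative assumptions, not taken from the text.

```python
def per_stage_success(overall: float, stages: int) -> float:
    """Per-stage success needed for a given over-all success, p ** (1/n)."""
    return overall ** (1.0 / stages)


def overall_success(per_stage: float, stages: int) -> float:
    """Over-all success when every stage succeeds independently, q ** n."""
    return per_stage ** stages


if __name__ == "__main__":
    for overall in (0.75, 0.90, 0.50):
        needed = per_stage_success(overall, 5)
        print(f"{overall:.0%} over five stages needs {needed:.1%} per stage")

    # The same per-stage quality degrades sharply over longer chains.
    for stages in (5, 50):
        print(f"98% per stage over {stages} stages gives "
              f"{overall_success(0.98, stages):.1%} over-all")
```

Rounded to whole per cents, the first loop gives the values of about 95, 98, and 87 per cent used in the argument.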

Man, with the best-developed nervous system of all the animals,
with behavior that probably depends on the longest chains of
effectively operated neuronic chains, is then likely to perform a
complicated type of behavior efficiently very close to the edge of an
overload, when he will give way in a serious and catastrophic way.
This overload may take place in several ways: either by an excess in
the amount of traffic to be carried, by a physical removal of channels
for the carrying of traffic, or by the excessive occupation of such
channels by undesirable systems of traffic, like circulating memories
which have increased to the extent of becoming pathological worries. 
In all these cases, a point will come—quite suddenly—when the 
normal traffic will not have space enough allotted to it, and we shall 
have a form of mental breakdown, very possibly amounting to 
insanity. 

This will first affect the faculties or operations involving the longest 
chains of neurons. There is appreciable evidence that these are 
precisely the processes which are recognized to be the highest in 
our ordinary scale of valuation. The evidence is this: a rise in 
temperature within nearly physiological limits is known to produce 
an increase in the ease of performance of most if not of all neuronic 
processes. This is greater for the higher processes, roughly in the 
order of our usual estimate of their degree of “highness.” Now,
any facilitation of a process in a single neuron-synapse system should 
be cumulative as the neuron is combined in series with other neurons. 
Thus the amount of assistance a process receives through a rise in 
temperature is a rough measure of the length of the neuron chain it 
involves. 

We thus see that the superiority of the human brain to others in 
the length of the neuron chains it employs is a reason why mental 
disorders are certainly most conspicuous and probably most common 
in man. There is another more specific way of considering a very 
similar matter. Let us first consider two brains geometrically
similar, with the weights of gray and of white matter related by the
same factor of proportionality, but with different linear dimensions
in the ratio $A:B$. Let the volume of the cell bodies in the gray
matter and the cross sections of the fibers in the white matter be of
the same size in both brains. Then the number of cell bodies in the
two cases bears the ratio $A^3:B^3$, and the number of long-distance
connectors the ratio $A^2:B^2$. This means that for the same density
of activity in the cells, the density of activity in the fibers is $A:B$
times as great in the case of the large brain as in that of the small 
brain. 
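
The scaling argument can be put in numbers with a short sketch. It assumes, as in the paragraph above, that cell count grows with the cube of the linear dimension and the count of long-distance fibers with its square; the function name and the sample dimensions are hypothetical.

```python
def fiber_load_ratio(a: float, b: float) -> float:
    """Relative traffic per long-distance fiber in brain A compared with brain B.

    Cells scale as the cube of linear size, fibers as the square, so the
    load carried by each fiber scales as (a/b)**3 / (a/b)**2 = a/b.
    """
    scale = a / b
    cell_ratio = scale ** 3
    fiber_ratio = scale ** 2
    return cell_ratio / fiber_ratio


if __name__ == "__main__":
    # A brain of twice the linear dimensions, an arbitrary illustrative case.
    print(fiber_load_ratio(2.0, 1.0))
```

Doubling the linear dimensions gives eight times the cells but only four times the trunk lines, so each line must carry twice the traffic.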

If we compare the human brain with that of a lower mammal, we 
shall find that it is much more convoluted. The relative thickness
of the gray matter is much the same, but it is spread over a far
more involved system of gyri and sulci. The effect of this is to
increase the amount of gray matter at the expense of the amount of 
white matter. Within a gyrus, this decrease of the white matter is 
largely a decrease in length rather than in number of fibers, as the 
opposing folds of a gyrus are nearer together than they would be on a 
smooth-surfaced brain of the same size. On the other hand, when 
it comes to the connectors between different gyri, the distance they 
have to run is increased if anything by the convolution of the brain. 
Thus the human brain would seem to be fairly efficient in the matter 
of the short-distance connectors, but quite defective in the matter
of long-distance trunk lines. This means that in case of a traffic
jam the processes involving parts of the brain quite remote from one
another should suffer first. That is, processes involving several 
centers, a number of different motor processes, and a considerable 
number of association areas should be among the least stable in cases 
of insanity. These are precisely the processes which we should 
normally class as higher, and we obtain another confirmation of our 
expectation, which seems to be verified by experience, that the higher 
processes deteriorate first in insanity.

There is some evidence that the long-distance paths in the brain 
have a tendency to run outside of the cerebrum altogether and to 
traverse the lower centers. This is indicated by the remarkably 
small damage done by cutting some of the long-distance cerebral 
loops of white matter. It almost seems as if these superficial 
connections were so inadequate that they furnish only a small part 
of the connections really needed. 

With reference to this, the phenomena of handedness and of 
hemispheric dominance are interesting. Handedness seems to occur 
in the lower mammals, though it is less conspicuous than in man,
probably in part because of the lower degree of organization and skill
demanded by the tasks which they perform. Nevertheless, the
choice between the right and the left side in muscular skill does
actually seem to be less than in man even in the lower primates. 

The right-handedness of the normal man, as is well known, is 
generally associated with a left-brainedness, and the left-handedness 
of a minority of humans with a right-brainedness. That is, the 
cerebral functions are not distributed evenly over the two
hemispheres, and one of these, the dominant hemisphere, has the lion’s
share of the higher functions. It is true that many essentially
bilateral functions—those involving the fields of vision, for example—
are represented each in its appropriate hemisphere, though this is 
not true for all bilateral functions. However, most of the “higher” 
areas are confined to the dominant hemisphere. For example, in 
the adult, the effect of an extensive injury in the secondary
hemisphere is far less serious than the effect of a similar injury in the
dominant hemisphere. At a relatively early age in his career, 
Pasteur suffered a cerebral hemorrhage on his right side which left 
him with a moderate degree of one-sided paralysis, a hemiplegia.
When he died, his brain was examined, and he was found to be 
suffering from a right-sided injury, so extensive that it has been said 
that after his injury “he had only half a brain.” There certainly 
were extensive lesions of the parietal and temporal regions.
Nevertheless, after this injury he did some of his best work. A similar
injury of the left side in a right-handed adult would almost certainly 
have been fatal and would certainly reduce the patient into an animal 
condition of mental and nervous crippledness. 

It is said that the situation is considerably better in early infancy, 
and that in the first six months of life an extensive injury to the 
dominant hemisphere may compel the normally secondary
hemisphere to take its place; so that the patient appears far more nearly
normal than he would be had the injury occurred at a later stage. 
This is quite in accordance with the general great flexibility shown
by the nervous system in the early weeks of life, and the great rigidity 
which it rapidly develops later. It is possible that, short of such 
serious injuries, handedness is reasonably flexible in the very young 
child. However, long before the child is of school age, the natural 
handedness and cerebral dominance are established for life. It used
to be thought that left-handedness was a serious social disadvantage. 
With most tools, school desks, and sports equipment primarily made 
for the right-handed, it certainly is to some extent. In the past, 
moreover, it was viewed with some of the superstitious disapproval
that has attached to so many minor variations from the human
norm, such as birthmarks or red hair. From a combination of 
motives, many people have attempted, and even succeeded, in
changing the external handedness of their children by education,
though of course they could not change its physiological basis in
hemispheric dominance. It was then found that in very many cases 
these hemispheric changelings suffered from stuttering and other 
defects of speech, reading, and writing, to the extent of seriously 
wounding their prospects in life and their hopes for a normal career. 

We now see at least one possible explanation for the phenomenon.
With the education of the secondary hand, there has been a partial
education of that part of the secondary hemisphere which deals with
skilled motions, such as writing. Since, however, these motions are
carried out in the closest possible association with reading, speech,
and other activities which are inseparably connected with the
dominant hemisphere, the neuron chains involved in processes of the sort
must cross over from hemisphere to hemisphere and back; and in a 
process of any complication, they must do this again and again. 
Now, the direct connectors between the hemispheres—the cerebral 
commissures—in a brain as large as that of man are so few in number 
that they are of very little use, and the interhemispheric traffic must 
go by roundabout routes through the brain stem, which we know 
very imperfectly but which are certainly long, scanty, and subject to 
interruption. As a consequence, the processes associated with 
speech and writing are very likely to be involved in a traffic jam, 
and stuttering is the most natural thing in the world. 

That is, the human brain is probably too large already to use in an 
efficient manner all the facilities which seem to be anatomically 
present. In a cat, the destruction of the dominant hemisphere 
seems to produce relatively less damage than in man, and the 
destruction of the secondary hemisphere probably more damage. 
At any rate, the apportionment of function in the two hemispheres 
is more nearly equal. In man, the gain achieved by the increase in 
size and complication of the brain is partly nullified by the fact that
less of the organ can be used effectively at one time. It is interesting 
to reflect that we may be facing one of those limitations of nature in 
which highly specialized organs reach a level of declining efficiency 
and ultimately lead to the extinction of the species. The human 
brain may be as far along on its road to this destructive specialization
as the great nose horns of the last of the titanotheres. 



VIII 


Information, Language, 
and Society 


The concept of an organization, the elements of which are
themselves small organizations, is neither unfamiliar nor new. The loose
federations of ancient Greece, the Holy Roman Empire and its
similarly constituted feudal contemporaries, the Swiss Companions
of the Oath, the United Netherlands, the United States of America,
and the many United States to the south of it, the Union of Socialist
Soviet Republics, are all examples of hierarchies of organizations on
the political sphere. The Leviathan of Hobbes, the Man-State made 
up of lesser men, is an illustration of the same idea one stage lower 
in scale, while Leibniz’s treatment of the living organism as being 
really a plenum, wherein other living organisms, such as the blood
corpuscles, have their life, is but another step in the same direction.
It is, in fact, scarcely more than a philosophical anticipation of the
cell theory, according to which most of the animals and plants of
moderate size and all of those of large dimensions are made up of
units, cells, which have many if not all the attributes of independent
living organisms. The multicellular organisms may themselves be
the building bricks of organisms of a higher stage, such as the
Portuguese man-of-war, which is a complex structure of differentiated
coelenterate polyps, where the several individuals are modified in
different ways to serve the nutrition, the support, the locomotion,
the excretion, the reproduction, and the support of the colony as a
whole. 

Strictly speaking, such a physically conjoint colony as that poses
no question of organization which is philosophically deeper than those
which arise at a lower level of individuality. It is very different 
with man and the other social animals—with the herds of baboons
or cattle, the beaver colonies, the hives of bees, the nests of wasps or
ants. The degree of integration of the life of the community may
very well approach the level shown in the conduct of a single individual,
yet the individual will probably have a fixed nervous system, with
permanent topographic relations between the elements and
permanent connections, while the community consists of individuals
with shifting relations in space and time and no permanent,
unbreakable physical connections. All the nervous tissue of the beehive is
the nervous tissue of some single bee. How then does the beehive act
in unison, and at that in a very variable, adapted, organized unison?
Obviously, the secret is in the intercommunication of its members. 

This intercommunication can vary greatly in complexity and 
content. With man, it embraces the whole intricacy of language 
and literature, and very much besides. With the ants, it probably 
does not cover much more than a few smells. It is very improbable 
that an ant can distinguish one ant from another. It certainly can 
distinguish an ant from its own nest from an ant from a foreign nest, 
and may cooperate with the one, destroy the other. Within a few 
outside reactions of this kind, the ant seems to have a mind almost 
as patterned, chitin-bound, as its body. It is what we might expect 
a priori from an animal whose growing phase and, to a large extent, 
whose learning phase are rigidly separated from the phase of mature 
activity. The only means of communication we can trace in them 
are as general and diffuse as the hormonal system of communication 
within the body. Indeed, smell, one of the chemical senses, general
and undirectional as it is, is not unlike the hormonal influences 
within the body. 

Let it be remarked parenthetically that musk, civet, castoreum,
and the like sexually attractive substances in the mammals may be
regarded as communal, exterior hormones, indispensable, especially
in solitary animals, for bringing the sexes together at the proper
time, and serve for the continuation of the race. By this I do not 
mean to assert that the inner action of these substances, once they 
reach the organ of smell, is hormonal rather than nervous. It is hard 
to see how it can be purely hormonal in quantities as small as those 
which are readily perceivable; on the other hand, we know too little 
of the action of the hormones to deny the possibility of the hormonal 
action of vanishingly small quantities of such substances. Moreover, 
the long, twisted rings of carbon atoms found in muskone and civetone
do not need too much rearrangement to form the linked ring structure
characteristic of the sex hormones, some of the vitamins, and some
of the carcinogens. I do not care to pronounce an opinion on this
matter; I leave it as an interesting speculation.

The odors perceived by the ant seem to lead to a highly standardized
course of conduct; but the value of a simple stimulus, such as an
odor, for conveying information depends not only on the information
conveyed by the stimulus itself but on the whole nervous constitution
of the sender and the receiver of the stimulus as well. Suppose I
find myself in the woods with an intelligent savage who cannot speak 
my language and whose language I cannot speak. Even without any 
code of sign language common to the two of us, I can learn a great 
deal from him. All I need to do is to be alert to those moments 
when he shows the signs of emotion or interest. I then cast my eyes 
around, perhaps paying special attention to the direction of his 
glance, and fix in my memory what I see or hear. It will not be 
long before I discover the things which seem important to him, not 
because he has communicated them to me by language, but because 
I myself have observed them. In other words, a signal without an 
intrinsic content may acquire meaning in his mind by what he 
observes at the time, and may acquire meaning in my mind by what I 
observe at the time. The ability that he has to pick out the moments 
of my special, active attention is in itself a language as varied in 
possibilities as the range of impressions that the two of us are able 
to encompass. Thus social animals may have an active, intelligent,
flexible means of communication long before the development of 
language. 

Whatever means of communication the race may have, it is 
possible to define and to measure the amount of information avail- 
able to the race and to distinguish it from the amount of information 
available to the individual. Certainly no information available to 
the individual is also available to the race unless it modifies the
behavior of one individual to another, nor is even that behavior of
racial significance unless it is distinguishable by other individuals
from other forms of behavior. Thus the question as to whether a
certain piece of information is racial or of purely private availability
depends on whether it results in the individual assuming a form of
activity which can be recognized as a distinct form of activity by
other members of the race, in the sense that it will in turn affect
their activity, and so on. 

I have spoken of the race. This is really too broad a term for the
scope of most communal information. Properly speaking, the
community extends only so far as there extends an effectual
transmission of information. It is possible to give a sort of measure to
this by comparing the number of decisions entering a group from 
outside with the number of decisions made in the group. We can 
thus measure the autonomy of the group. A measure of the effective 
size of a group is given by the size which it must have to have 
achieved a certain stated degree of autonomy. 
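
One way to make this comparison concrete is sketched below in Python. The ratio used here (decisions made inside the group divided by all decisions bearing on it) is an assumption chosen for illustration; the text says only that the two counts are to be compared. The function names and the sample counts are likewise hypothetical.

```python
def autonomy(internal_decisions, external_decisions):
    """Share of the decisions bearing on a group that are made inside it.

    Assumed measure: internal / (internal + external), between 0 and 1.
    """
    total = internal_decisions + external_decisions
    if total == 0:
        raise ValueError("no decisions observed")
    return internal_decisions / total


def effective_size(observations, threshold):
    """Smallest observed group size whose autonomy reaches the threshold.

    `observations` holds (size, internal_decisions, external_decisions)
    tuples; returns None if no observed group is autonomous enough.
    """
    for size, internal, external in sorted(observations):
        if autonomy(internal, external) >= threshold:
            return size
    return None


# Hypothetical counts for three nested groups of increasing size.
observations = [(5, 20, 80), (50, 120, 80), (500, 450, 50)]
print(effective_size(observations, 0.5))
```

With these invented counts the group of five makes only a fifth of its own decisions, so the smallest group reaching an autonomy of one half is the group of fifty.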

A group may have more group information or less group information
than its members. A group of non-social animals, temporarily
assembled, contains very little group information, even though its
members may possess much information as individuals. This is
because very little that one member does is noticed by the others and 
is acted on by them in a way that goes further in the group. On the 
other hand, the human organism contains vastly more information, 
in all probability, than does any one of its cells. There is thus no 
necessary relation in either direction between the amount of racial 
or tribal or community information and the amount of information
available to the individual. 

As in the case of the individual, not all the information which is 
available to the race at one time is accessible without special effort. 
There is a well-known tendency of libraries to become clogged by
their own volume; of the sciences to develop such a degree of
specialization that the expert is often illiterate outside his own minute
specialty. Dr. Vannevar Bush has suggested the use of mechanical 
aids for the searching through vast bodies of material. These 
probably have their uses, but they are limited by the impossibility 
of classifying a book under an unfamiliar heading unless some 
particular person has already recognized the relevance of that heading 
for that particular book. In the case where two subjects have the
same techniques and intellectual content but belong to widely 
separated fields, this still requires some individual with an almost 
Leibnizian catholicity of interest. 

In connection with the effective amount of communal information, 
one of the most surprising facts about the body politic is its extreme
lack of efficient homeostatic processes. There is a belief, current in
many countries, which has been elevated to the rank of an official
article of faith in the United States, that free competition is itself a
homeostatic process: that in a free market the individual selfishness of
the bargainers, each seeking to sell as high and buy as low as possible,
will result in the end in a stable dynamics of prices, and will redound
to the greatest common good. This is associated with the very
comforting view that the individual entrepreneur, in seeking to
forward his own interest, is in some manner a public benefactor and
has thus earned the great rewards with which society has showered
him. Unfortunately, the evidence, such as it is, is against this
simple-minded theory. The market is a game, which has indeed received
a simulacrum in the family game of Monopoly. It is thus strictly
subject to the general theory of games, developed by von Neumann
and Morgenstern. This theory is based on the assumption that each
player, at every stage, in view of the information then available to 
him, plays in accordance with a completely intelligent policy, which 
will in the end assure him of the greatest possible expectation of
reward. It is thus the market game as played between perfectly 
intelligent, perfectly ruthless operators. Even in the case of two 
players, the theory is complicated, although it often leads to the
choice of a definite line of play. In many cases, however, where 
there are three players, and in the overwhelming majority of cases, 
when the number of players is large, the result is one of extreme 
indeterminacy and instability. The individual players are compelled 
by their own cupidity to form coalitions; but these coalitions do not 
generally establish themselves in any single, determinate way, and 
usually terminate in a welter of betrayal, turncoatism, and deception,
which is only too true a picture of the higher business life, or the closely 
related lives of politics, diplomacy, and war. In the long run, even 
the most brilliant and unprincipled huckster must expect ruin; but 
let the hucksters become tired of this and agree to live in peace with 
one another, and the great rewards are reserved for the one who 
watches for an opportune time to break his agreement and betray 
his companions. There is no homeostasis whatever. We are 
involved in the business cycles of boom and failure, in the successions 
of dictatorship and revolution, in the wars which everyone loses, 
which are so real a feature of modern times.

Naturally, von Neumann’s picture of the player as a completely 
intelligent, completely ruthless person is an abstraction and a 
perversion of the facts. It is rare to find a large number of thoroughly
clever and unprincipled persons playing a game together. Where the 
knaves assemble, there will always be fools; and where the fools are 
present in sufficient numbers, they offer a more profitable object of
exploitation for the knaves. The psychology of the fool has become 
a subject well worth the serious attention of the knaves. Instead 
of looking out for his own ultimate interest, after the fashion of von
Neumann’s gamesters, the fool operates in a manner which, by and
large, is as predictable as the struggles of a rat in a maze. This
policy of lies (or rather, of statements irrelevant to the truth) will
make him buy a particular brand of cigarettes; that policy will, or
so the party hopes, induce him to vote for a particular candidate
(any candidate) or to join in a political witch hunt. A certain
precise mixture of religion, pornography, and pseudoscience will
sell an illustrated newspaper. A certain blend of wheedling,
bribery, and intimidation will induce a young scientist to work on
guided missiles or the atomic bomb. To determine these, we have
our machinery of radio fan ratings, straw votes, opinion samplings,
and other psychological investigations, with the common man as
their object; and there are always the statisticians, sociologists,
and economists available to sell their services to these undertakings.

Luckily for us, these merchants of lies, these exploiters of gullibility,
have not yet arrived at such a pitch of perfection as to have things all 
their own way. This is because no man is either all fool or all knave. 
The average man is quite reasonably intelligent concerning subjects 
which come to his direct attention and quite reasonably altruistic 
in matters of public benefit or private suffering which are brought
before his own eyes. In a small country community which has been
running long enough to have developed somewhat uniform levels of
intelligence and behavior, there is a very respectable standard of care
for the unfortunate, of administration of roads and other public 
facilities, of tolerance for those who have offended once or twice 
against society. After all, these people are there, and the rest of the 
community must continue to live with them. On the other hand,
in such a community, it does not do for a man to have the habit of 
overreaching his neighbors. There are ways of making him feel the 
weight of public opinion. After a while, he will find it so ubiquitous,
so unavoidable, so restricting and oppressing that he will have to 
leave the community in self-defense. 

Thus small, closely knit communities have a very considerable 
measure of homeostasis; and this, whether they are highly literate 
communities in a civilized country or villages of primitive savages. 
Strange and even repugnant as the customs of many barbarians may 
seem to us, they generally have a very definite homeostatic value,
which it is part of the function of anthropologists to interpret. It 
is only in the large community, where the Lords of Things as They 
Are protect themselves from hunger by wealth, from public opinion
by privacy and anonymity, from private criticism by the laws of
libel and the possession of the means of communication, that
ruthlessness can reach its most sublime levels. Of all of these
anti-homeostatic factors in society, the control of the means of communication
is the most effective and most important.

One of the lessons of the present book is that any organism is
held together in this action by the possession of means for the
acquisition, use, retention, and transmission of information. In a
society too large for the direct contact of its members, these means
are the press, both as it concerns books and as it concerns
newspapers, the radio, the telephone system, the telegraph, the posts, the
theater, the movies, the schools, and the church. Besides their
intrinsic importance as means of communication, each of these serves 
other, secondary functions. The newspaper is a vehicle for
advertisement and an instrument for the monetary gain of its proprietor,
as are also the movies and the radio. The school and the church
are not merely refuges for the scholar and the saint: they are also the
home of the Great Educator and the Bishop. The book that does
not earn money for its publisher probably does not get printed and
certainly does not get reprinted. 

In a society like ours, avowedly based on buying and selling, in
which all natural and human resources are regarded as the absolute 
property of the first business man enterprising enough to exploit 
them, these secondary aspects of the means of communication tend
to encroach further and further on the primary ones. This is aided 
by the very elaboration and the consequent expense of the means 
themselves. The country paper may continue to use its own reporters
to canvass the villages around for gossip, but it buys its national
news, its syndicated features, its political opinions, as stereotyped
“boiler plate.” The radio depends on its advertisers for income, and,
as everywhere, the man who pays the piper calls the tune. The
great news services cost too much to be available to the publisher
of moderate means. The book publishers concentrate on books that
are likely to be acceptable to some book club which buys out the 
whole of an enormous edition. The college president and the 
Bishop, even if they have no personal ambitions for power, have 
expensive institutions to run and can only seek their money where the 
money is. 

Thus on all sides we have a triple constriction of the means of 
communication: the elimination of the less profitable means in 
favor of the more profitable; the fact that these means are in the 
hands of the very limited class of wealthy men, and thus naturally 
express the opinions of that class; and the further fact that, as one 
of the chief avenues to political and personal power, they attract 
above all those ambitious for such power. That system which 
more than all others should contribute to social homeostasis is
thrown directly into the hands of those most concerned in the game
of power and money, which we have already seen to be one of the
chief anti-homeostatic elements in the community. It is no wonder
then that the larger communities, subject to this disruptive influence,
contain far less communally available information than the smaller
communities, to say nothing of the human elements of which all
communities are built up. Like the wolf pack, although let us
hope to a lesser extent, the State is stupider than most of its
components.

This runs counter to a tendency much voiced among business
executives, heads of great laboratories, and the like, to assume that 
because the community is larger than the individual it is also more 
intelligent. Some of this opinion is due to no more than a childish
delight in the large and the lavish. Some of it is due to a sense of the
possibilities of a large organization for good. Not a little of it,
however, is nothing more than an eye for the main chance and a
lusting after the fleshpots of Egypt.

There is another group of those who see nothing good in the 
anarchy of modern society, and in whom an optimistic feeling that
there must be some way out has led to an overvaluation of the 
possible homeostatic elements in the community. Much as we may 
sympathize with these individuals and appreciate the emotional
dilemma in which they find themselves, we cannot attribute too
much value to this type of wishful thinking. It is the mode of thought
of the mice when faced with the problem of belling the cat.
Undoubtedly it would be very pleasant for us mice if the predatory cats
of this world were to be belled, but who is going to do it? Who is to
assure us that ruthless power will not find its way back into the 
hands of those most avid for it? 

I mention this matter because of the considerable, and I think 
false, hopes which some of my friends have built for the social 
efficacy of whatever new ways of thinking this book may contain. 
They are certain that our control over our material environment has 
far outgrown our control over our social environment and our 
understanding thereof. Therefore, they consider that the main 
task of the immediate future is to extend to the fields of anthropology, 
of sociology, of economics, the methods of the natural sciences, in
the hope of achieving a like measure of success in the social fields. 
From believing this necessary, they come to believe it possible. In 
this, I maintain, they show an excessive optimism, and a mis- 
understanding of the nature of all scientific achievement. 

All the great successes in precise science have been made in fields
where there is a certain high degree of isolation of the phenomenon
from the observer. We have seen in the case of astronomy that this
may result from the enormous scale of certain phenomena with
respect to man, so that man’s mightiest efforts, not to speak of his
mere glance, cannot make the slightest visible impression on the
celestial world. In modern atomic physics, on the other hand, the
science of the unspeakably minute, it is true that anything we do will
have an influence on many individual particles which is great from
the point of view of that particle. However, we do not live on the
scale of the particles concerned, either in space or in time; and the
events that might be of the greatest significance from the point of
view of an observer conforming to their scale of existence appear to
us (with some exceptions, it is true, as in the Wilson cloud-chamber
experiments) only as average mass effects in which enormous
populations of particles cooperate. As far as these effects are
concerned, the intervals of time concerned are large from the point
of view of the individual particle and its motion, and our statistical
theories have an admirably adequate basis. In short, we are too 
small to influence the stars in their courses, and too large to care about 
anything but the mass effects of molecules, atoms, and electrons. 
In both cases, we achieve a sufficiently loose coupling with the 
phenomena we are studying to give a massive total account of this 
coupling, although the coupling may not be loose enough for us to 
be able to ignore it altogether. 

It is in the social sciences that the coupling between the observed
phenomenon and the observer is hardest to minimize. On the one 
hand, the observer is able to exert a considerable influence on the 
phenomena that come to his attention. With all respect to the 
intelligence, skill, and honesty of purpose of my anthropologist 
friends, I cannot think that any community which they have in- 
vestigated will ever be quite the same afterward. Many a missionary 
has fixed his own misunderstandings of a primitive language as law 
eternal in the process of reducing it to writing. There is much in the
social habits of a people which is dispersed and distorted by the mere
act of making inquiries about it. In another sense from that in
which it is usually stated, traduttore traditore. 

On the other hand, the social scientist has not the advantage of 
looking down on his subjects from the cold heights of eternity and
ubiquity. It may be that there is a mass sociology of the human
animalcule, observed like the populations of Drosophila in a bottle,
but this is not a sociology in which we, who are human animalcules
ourselves, are particularly interested. We are not much concerned
about human rises and falls, pleasures and agonies, sub specie
aeternitatis. Your anthropologist reports the customs associated
with the life, education, career, and death of people whose life scale
is much the same as his own. Your economist is most interested in
predicting such business cycles as run their course in less than a
generation or, at least, have repercussions which affect a man
differentially at different stages of his career. Few philosophers of
politics nowadays care to confine their investigations to the world of 
Ideas of Plato. 

In other words, in the social sciences we have to deal with short
statistical runs, nor can we be sure that a considerable part of what
we observe is not an artifact of our own creation. An investigation
of the stock market is likely to upset the stock market. We are too
much in tune with the objects of our investigation to be good probes.
In short, whether our investigations in the social sciences be statistical
or dynamic (and they should participate in the nature of both),
they can never be good to more than a very few decimal places, and
can never furnish us with a quantity of verifiable, significant
information which begins to compare with that which we have
learned to expect in the natural sciences. We cannot afford to neglect
them; neither should we build exaggerated expectations of their 
possibilities. There is much which we must leave, whether we like 
it or not, to the un-“scientific,” narrative method of the professional 
historian. 


Note 

There is one question which properly belongs to this chapter,
though it in no sense represents a culmination of its argument. It 
is the question whether it is possible to construct a chess-playing 
machine, and whether this sort of ability represents an essential 
difference between the potentialities of the machine and the mind. 
Note that we need not raise the question as to whether it is possible 
to construct a machine which will play an optimum game in the sense 
of von Neumann. Not even the best human brain approximates to 
this. At the other end, it is unquestionably possible to construct a 
machine that will play chess in the sense of following the rules of the 
game, irrespective of the merit of the play. This is essentially no 
more difficult than the construction of a system of interlocking
signals for a railway signal tower. The real problem is intermediate:
to construct a machine which shall offer interesting opposition to a
player at some one of the many levels at which human chess players 
find themselves. 

I think it is possible to construct a relatively crude but not
altogether trivial apparatus for this purpose. The machine must 
actually play—at a high speed if possible—all its own admissible 
moves and all the opponent’s admissible ripostes for two or three 
moves ahead. To each sequence of moves it should assign a certain 
conventional valuation. Here, to checkmate the opponent receives 
the highest valuation at each stage, to be checkmated, the lowest; 
while losing pieces, taking opponent’s pieces, checking, and other 
recognizable situations should receive valuations not too remote from
those which good players would assign them. The first of an entire 
sequence of moves should receive a valuation much as von Neumann’s 
theory would assign it. At the stage at which the machine is to 
play once and the opponent once, the valuation of a play by the 
machine is the minimum valuation of the situation after the opponent 
has made all possible plays. At the stage where the machine is to 
play twice and the opponent twice, the valuation of a play by the 
machine is the minimum with respect to the opponent’s first play of 
the maximum valuation of the plays by the machine at the stage when
there is only one play of the opponent and one by the machine to 
follow. This process can be extended to the case when each player 
makes three plays, and so on. Then the machine chooses any one 
of the plays giving the maximum valuation for the stage n plays
ahead, where n has some value on which the designer of the machine
has decided. This it makes as its definitive play. 
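
The look-ahead valuation just described can be sketched in a few lines of Python. To keep the example short and runnable, chess is replaced by a much simpler stand-in game: the players alternately remove one, two, or three counters from a pile, and whoever takes the last counter wins. The scale of the valuations, the neutral value assigned at the horizon, and all function names are assumptions of this sketch, not part of the text.

```python
WIN, LOSS = 1000, -1000  # assumed scale: highest and lowest valuations


def legal_moves(counters):
    """Plays open to the side to move: take 1, 2, or 3 counters."""
    return [k for k in (1, 2, 3) if k <= counters]


def valuation(counters, machine_to_move, plies):
    """Value of a position to the machine, looking `plies` single plays ahead."""
    if counters == 0:
        # Whoever just played took the last counter and has won.
        return LOSS if machine_to_move else WIN
    if plies == 0:
        return 0  # horizon reached: neutral stand-in for a conventional valuation
    values = [valuation(counters - k, not machine_to_move, plies - 1)
              for k in legal_moves(counters)]
    # The machine takes its best play; the opponent is assumed to take
    # the play that is worst for the machine.
    return max(values) if machine_to_move else min(values)


def choose_move(counters, n):
    """Play with the maximum valuation when each side makes n plays."""
    plies = 2 * n
    return max(legal_moves(counters),
               key=lambda k: valuation(counters - k, False, plies - 1))


print(choose_move(3, 1))  # an immediate win is available and is taken
print(choose_move(9, 3))  # a forced win within three plays each is found
```

As in the text, the value of a play by the machine is the minimum over the opponent's replies of the maximum over the machine's next plays, carried to a fixed depth n. In a chess version the neutral horizon value would be replaced by a valuation of material, checks, and the like.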

Such a machine would not only play legal chess, but a chess not so 
manifestly bad as to be ridiculous. At any stage, if there were a 
mate possible in two or three moves, the machine would make it; 
and if it were possible to avoid a mate by the enemy in two or three 
moves, the machine would avoid it. It would probably win over a 
stupid or careless chess player, and would almost certainly lose to a 
careful player of any considerable degree of proficiency. In other 
words, it might very well be as good a player as the vast majority of 
the human race. This does not mean that it would reach the degree 
of proficiency of Maelzel’s fraudulent machine, but, for all that, it 
may attain a pretty fair level of accomplishment.


IX


On Learning 

and Self-Reproducing Machines 


Two of the phenomena which we consider to be characteristic of
living systems are the power to learn and the power to reproduce
themselves. These properties, different as they appear, are
intimately related to one another. An animal that learns is one which is
capable of being transformed by its past environment into a different 
being and is therefore adjustable to its environment within its 
individual lifetime. An animal that multiplies is able to create other 
animals in its own likeness at least approximately, although not so
completely in its own likeness that they cannot vary in the course of
time. If this variation is itself inheritable, we have the raw material
on which natural selection can work. If the hereditary invariability
concerns manners of behavior, then among the varied patterns of
behavior which are propagated some will be found advantageous to
the continuing existence of the race and will establish themselves,
while others which are detrimental to this continuing existence will 
be eliminated. The result is a certain sort of racial or phylogenetic 
learning, as contrasted with the ontogenetic learning of the individual. 
Both ontogenetic and phylogenetic learning are modes by which the 
animal can adjust itself to its environment. 

Both ontogenetic and phylogenetic learning, and certainly the 
latter, extend themselves not only to all animals but to plants and,
indeed, to all organisms which in any sense may be considered to be
living. However, the degree to which these two forms of learning
are found to be important in different sorts of living beings varies
widely. In man, and to a lesser extent in the other mammals,
ontogenetic learning and individual adaptability are raised to the
highest point. Indeed, it may be said that a large part of the
phylogenetic learning of man has been devoted to establishing the
possibility of good ontogenetic learning.

It has been pointed out by Julian Huxley in his fundamental
paper on the mind of birds 1 that birds have a small capacity for
ontogenetic learning. Something similar is true in the case of insects,
and in both instances it may be associated with the terrific demands
made on the individual by flight and the consequential pre-emption 
of the capabilities of the nervous system which might otherwise be 
applied to ontogenetic learning. Complicated as the behavior 
patterns of birds are—in flying, in courtship, in the care of the young, 
and in nest building—they are carried out correctly the very first 
time without the need of any large amount of instruction from the 
mother. 

It is altogether appropriate to devote a chapter of this book to 
these two related subjects. Can man-made machines learn and can 
they reproduce themselves? We shall try to show in this chapter 
that in fact they can learn and can reproduce themselves, and we 
shall give an account of the technique needed for both these activities. 

The simpler of these two processes is that of learning, and it is 
there that the technical development has gone furthest. I shall talk
here particularly of the learning of game-playing machines which 
enables them to improve the strategy and tactics of their performance 
by experience. 

There is an established theory of the playing of games, the von
Neumann theory. 2 It concerns a policy which is best considered by
working from the end of the game rather than from the beginning. 
In the last move of the game, a player strives to make a winning 
move if possible, and if not, then at least a drawing move. His 
opponent, at the previous stage, strives to make a move which will 
prevent the other player from making a winning or a drawing move. 
If he can himself make a winning move at that stage, he will do so, 
and this will not be the next to the last but the last stage of the game. 
The other player at the move before this will try to act in such a way 
that the very best resources of his opponent will not prevent him 
from ending with a winning move, and so on backward. 

1 Huxley, J., Evolution: The Modern Synthesis, Harper Bros., New York, 1943.

2 von Neumann, J., and O. Morgenstern, Theory of Games and Economic Behavior,
Princeton University Press, Princeton, N.J., 1944.

There are games such as ticktacktoe where the entire strategy
is known, and it is possible to start this policy from the very
beginning. When this is feasible, it is manifestly the best way of
playing the game. However, in many games like chess and checkers
our knowledge is not sufficient to permit a complete strategy of this 
sort, and then we can only approximate to it. The von Neumann 
type of approximate theory tends to lead a player to act with the 
utmost caution, assuming that his opponent is the perfectly wise 
sort of a master. 

This attitude, however, is not always justified. In war, which is a 
sort of game, this will in general lead to an indecisive action which will 
often be not much better than a defeat. Let me give two historical 
examples. When Napoleon fought the Austrians in Italy, it was 
part of his effectiveness that he knew the Austrian mode of military 
thought to be hidebound and traditional, so that he was quite 
justified in assuming that they were incapable of taking advantage of 
the new decision-compelling methods of war which had been devel- 
oped by the soldiers of the French Revolution. When Nelson fought 
the combined fleets of continental Europe, he had the advantage of 
fighting with a naval machine which had kept the seas for years and 
which had developed methods of thought of which, as he was well 
aware, his enemies were incapable. If he had not made the fullest 
possible use of this advantage, instead of acting as cautiously as he 
would have had to act under the supposition that he was facing an 
enemy of equal naval experience, he might have won in the long run 
but could not have won so quickly and decisively as to establish the 
tight naval blockade which was the ultimate downfall of Napoleon. 
Thus, in both cases, the guiding factor was the known record of the 
commander and of his opponents, as exhibited statistically in the past 
of their actions, rather than an attempt to play the perfect game 
against the perfect opponent. Any direct use of the von Neumann 
method of game theory in these cases would have proved futile. 
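
The contrast between the two attitudes can be made concrete with a small sketch. Everything in it is an assumption made for illustration: the payoff numbers, the plan and reply labels, and the function names are hypothetical, and the "statistical" rule shown (best average payoff against recorded frequencies) is only one simple way of using an opponent's past record.

```python
# Hypothetical payoff table for our side (all numbers are illustrative).
# Outer keys: our plans. Inner keys: the opponent's possible replies.
PAYOFF = {
    "cautious": {"skilled_reply": 1, "hidebound_reply": 2},
    "bold": {"skilled_reply": -3, "hidebound_reply": 6},
}


def maximin_plan(payoff):
    """Plan that is best if the opponent always finds the most damaging reply."""
    return max(payoff, key=lambda plan: min(payoff[plan].values()))


def statistical_plan(payoff, past_replies):
    """Plan with the best average payoff against the opponent's recorded habits."""
    total = len(past_replies)
    freq = {r: past_replies.count(r) / total for r in set(past_replies)}

    def expected(plan):
        return sum(freq.get(reply, 0.0) * gain
                   for reply, gain in payoff[plan].items())

    return max(payoff, key=expected)


# Assumed record: the opponent has replied in the hidebound way 9 times in 10.
record = ["hidebound_reply"] * 9 + ["skilled_reply"]
```

With this record, `maximin_plan(PAYOFF)` selects the cautious plan, while `statistical_plan(PAYOFF, record)` selects the bold one, since the risk it runs is rarely realized by this particular opponent.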

In a similar way, books on chess theory are not written from the 
von Neumann point of view. They are compendia of principles 
drawn from the practical experience of chess players playing against 
other chess players of high quality and wide knowledge; and they 
establish certain values or weightings to be given to the loss of each 
piece, to mobility, to command, to development, and to other factors 
which may vary with the stage of the game. 

It is not very difficult to make machines which will play chess of a 
sort. The mere obedience to the laws of the game, so that only legal 
moves are made, is easily within the power of quite simple computing 
machines. Indeed, it is not hard to adapt an ordinary digital 
machine to these purposes. 

Now comes the question of policy within the rules of the game. 
Every evaluation of pieces, command, mobility, and so forth, is 
intrinsically capable of being reduced to numerical terms; and when 
this is done, the maxims of a chess book may be used for the de- 
termination of the best moves of each stage. Such machines have 
been made; and they will play a very fair amateur chess, although at 
present not a game of master caliber. 
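
A minimal sketch of such a numerical evaluation follows. The feature names, the weights, and the candidate moves are hypothetical placeholders; a real program would have to compute the feature values from the position on the board, which is not attempted here.

```python
# Hypothetical weights for a few of the factors named in the text.
WEIGHTS = {"material": 1.0, "mobility": 0.1, "command": 0.3}


def evaluate(features, weights=WEIGHTS):
    """Weighted sum of the numerical features of a position."""
    return sum(weights[name] * features.get(name, 0.0) for name in weights)


def choose_move(candidates, weights=WEIGHTS):
    """Pick the legal move whose resulting position scores highest.

    candidates maps a move label to the features of the position it leads to.
    """
    return max(candidates, key=lambda move: evaluate(candidates[move], weights))


# Illustrative candidates: one move wins material but cramps the pieces,
# the other develops freely without winning anything.
candidates = {
    "capture": {"material": 1.0, "mobility": -2.0, "command": 0.0},
    "develop": {"material": 0.0, "mobility": 3.0, "command": 1.0},
}
```

Because the weights are fixed, `choose_move` returns the same answer every time it meets the same candidates, which is the source of the rigid playing personality discussed next.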

Imagine yourself in the position of playing chess against such a 
machine. To make the situation fair, let us suppose you are playing 
correspondence chess without the knowledge that it is such a machine 
you are playing and without the prejudices that this knowledge may 
excite. Naturally, as always is the case with chess, you will come to 
a judgment of your opponent’s chess personality. You will find that 
when the same situation comes up twice on the chessboard, your 
opponent’s reaction will be the same each time, and you will find that 
he has a very rigid personality. If any trick of yours will work, then 
it will always work under the same conditions. It is thus not too 
hard for an expert to get a line on his machine opponent and to 
defeat him every time. 

However, there are machines that cannot be defeated so trivially. 
Let us suppose that every few games the machine takes time off and 
uses its facilities for another purpose. This time, it does not play 
against an opponent, but examines all the previous games which it 
has recorded on its memory to determine what weighting of the 
different evaluations of the worth of pieces, command, mobility, and 
the like, will conduce most to winning. In this way, it learns not 
only from its own failures but its opponent’s successes. It now 
replaces its earlier valuations by the new ones and goes on playing as 
a new and better machine. Such a machine would no longer have as 
rigid a personality, and the tricks which were once successful against 
it will ultimately fail. More than that, it may absorb in the course 
of time something of the policy of its opponents. 
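
The review of recorded games might be sketched as follows. The update used here is a simple perceptron-style correction, chosen only because it is short; it is an assumption of this sketch and is not claimed to be the procedure of any actual game-playing program. The record format (feature advantages held during a game, paired with the outcome) is likewise hypothetical.

```python
def score(weights, features):
    """Weighted evaluation of the feature advantages held in a game."""
    return sum(weights[k] * features.get(k, 0.0) for k in weights)


def relearn(weights, recorded_games, rate=0.5, passes=20):
    """Return revised weights after reviewing (features, outcome) records.

    outcome is +1 if the machine won the game and -1 if it lost.
    Whenever the current evaluation disagrees with the actual result,
    the weights are nudged toward the result (perceptron-style rule,
    an assumption of this sketch).
    """
    new = dict(weights)
    for _ in range(passes):
        for features, outcome in recorded_games:
            if score(new, features) * outcome <= 0:
                for k in new:
                    new[k] += rate * outcome * features.get(k, 0.0)
    return new


# Illustrative record: the machine lost a game in which it was ahead in
# material but badly cramped, and won games in which it had mobility.
old_weights = {"material": 1.0, "mobility": 0.0}
games = [
    ({"material": 1.0, "mobility": -2.0}, -1),
    ({"material": -1.0, "mobility": 2.0}, +1),
    ({"material": 1.0, "mobility": 1.0}, +1),
]
new_weights = relearn(old_weights, games)
```

After the review the machine values mobility, which it previously ignored, so the trick of offering material to cramp its pieces no longer succeeds against it.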

All this is very difficult to do in chess, and as a matter of fact the 
full development of this technique, so as to give rise to a machine 
that can play master chess, has not been accomplished. Checkers 
offers an easier problem. The homogeneity of the values of the 
pieces greatly reduces the number of combinations to be considered. 
Moreover, partly as a consequence of this homogeneity, the checker 
game is much less divided into distinct stages than the chess game. 
Even in checkers, the main problem of the end game is no longer to 
take pieces but to establish contact with the enemy so that one is in 
a position to take pieces. Similarly, the valuation of moves in the 
chess game must be made independently for the different stages. 
Not only is the end game different from the middle game in the con- 
siderations which are paramount, but the openings are much more 
devoted to getting the pieces into a position of free mobility for 
attack and defense than is the middle game. The result is that we 
cannot be even approximately content with a uniform evaluation of 
the various weighting factors for the game as a whole, but must 
divide the learning process into a number of separate stages. Only 
then can we hope to construct a learning machine which can play 
master chess. 

The idea of a first-order programming, which may be linear in 
certain cases, combined with a second-order programming, which 
uses a much more extensive segment of the past for the determination 
of the policy to be carried out in the first-order programming, has 
been mentioned earlier in this book in connection with the problem 
of prediction. The predictor uses the immediate past of the flight 
of the airplane as a tool for the prediction of the future by means of 
a linear operation; but the determination of the correct linear 
operation is a statistical problem in which the long past of the flight 
and the past of many similar flights are used to give the basis of the 
statistics. 
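
The two orders of programming can be illustrated with the simplest possible predictor, one with a single coefficient. This is a deliberately reduced stand-in, not the predictor discussed earlier in the book: the one-term model, the least-squares fit, and the function names are assumptions of the sketch.

```python
def fit_coefficient(history):
    """Second-order step: estimate a in x[t+1] ~ a * x[t] from a long record.

    The estimate is a ratio of two summed products (lag-one and lag-zero
    correlations), so it is not a linear function of the data.
    """
    num = sum(x1 * x0 for x0, x1 in zip(history, history[1:]))
    den = sum(x0 * x0 for x0 in history[:-1])
    return num / den


def predict_next(recent_value, a):
    """First-order step: a linear operation on the immediate past."""
    return a * recent_value


# Illustrative record in which each value is half the one before it.
long_past = [1.0, 0.5, 0.25, 0.125]
a = fit_coefficient(long_past)
forecast = predict_next(long_past[-1], a)
```

Doubling every value in `long_past` leaves `a` unchanged rather than doubling it, which shows in miniature why the determination of the coefficients is a non-linear operation even though the prediction itself is linear.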

The statistical studies necessary to use a long past for a determination 
of the policy to be adopted in view of the short past are highly 
non-linear. As a matter of fact, in the use of the Wiener-Hopf 
equation for prediction, 1 the determination of the coefficients of this 
equation is carried out in a non-linear manner. In general, a learning 
machine operates by non-linear feedback. The checker-playing 
machine described by Samuel 2 and Watanabe 3 can learn to defeat 
the man that programmed it in a fairly consistent way on the basis 
of from 10 to 20 operating hours of programming. 

Watanabe’s philosophical ideas on the use of programming 
machines are very exciting. On the one hand, he is treating a method 
of proving an elementary geometrical theorem which shall conform 
in an optimal way according to certain criteria of elegance and 
simplicity, as a learning game to be played not against an individual 
opponent but against what we may call “Colonel Bogey.” A similar 
game which Watanabe is studying is played in logical induction, 
when we wish to build up a theory which is optimal in a similar 
quasi-aesthetic way, on the basis of an evaluation of economy, 
directness, and the like, by the determination of the evaluation of a 
finite number of parameters which are left free. This, it is true, is 
only a limited logical induction, but it is well worth studying. 

1 Wiener, N., Extrapolation, Interpolation, and Smoothing of Stationary Time Series 
with Engineering Applications, The Technology Press of M.I.T. and John Wiley & 
Sons, New York, 1949. 

2 Samuel, A. L., “Some Studies in Machine Learning, Using the Game of Checkers,” 
IBM Journal of Research and Development, 3, 210-229 (1959). 

3 Watanabe, S., “Information Theoretical Analysis of Multivariate Correlation,” 
IBM Journal of Research and Development, 4, 66-82 (1960). 

Many forms of the activity of struggle, which we do not ordinarily 
consider as games, have a great deal of light thrown on them by the 
theory of game-playing machines. One interesting example is the 
fight between a mongoose and a snake. As Kipling points out in 
“Rikki-Tikki-Tavi,” the mongoose is not immune to the poison of 
the cobra, although it is to some extent protected by its coat of stiff 
hairs which makes it difficult for the snake to bite home. As 
Kipling states, the fight is a dance with death, a struggle of muscular 
skill and agility. There is no reason to suppose that the individual 
motions of the mongoose are faster or more accurate than those of the 
cobra. Yet the mongoose almost invariably kills the cobra and comes 
out of the contest unscathed. How is it able to do this? 

I am here giving an account which appears valid to me, from 
having seen such a fight, as well as motion pictures of other such 
fights. I do not guarantee the correctness of my observations as 
interpretations. The mongoose begins with a feint, which provokes 
the snake to strike. The mongoose dodges and makes another such 
feint, so that we have a rhythmical pattern of activity on the part of 
the two animals. However, this dance is not static but develops 
progressively. As it goes on, the feints of the mongoose come earlier 
and earlier in phase with respect to the darts of the cobra, until 
finally the mongoose attacks when the cobra is extended and not in 
a position to move rapidly. This time the mongoose’s attack is not 
a feint but a deadly accurate bite through the cobra’s brain. 

In other words, the snake’s pattern of action is confined to single 
darts, each one for itself, while the pattern of the mongoose’s action 
involves an appreciable, if not very long, segment of the whole past 
of the fight. To this extent the mongoose acts like a learning 
machine, and the real deadliness of its attack is dependent on a much 
more highly organized nervous system. 

As a Walt Disney movie of several years ago showed, something 
very similar happens when one of our western birds, the road runner, 
attacks a rattlesnake. While the bird fights with beak and claws, 
and a mongoose with its teeth, the pattern of activity is very similar. 
A bullfight is a very fine example of the same thing. For it must be 
remembered that the bullfight is not a sport but a dance with death, 
to exhibit the beauty and the interlaced coordinating actions of the 
bull and the man. Fairness to the bull has no part in it, and we can 
leave out from our point of view the preliminary goading and weaken- 
ing of the bull, which have the purpose of bringing the contest to a 
level where the interaction of the patterns of the two participants is 
most highly developed. The skilled bullfighter has a large repertory 
of possible actions, such as the flaunting of the cape, various dodges 
and pirouettes, and the like, which are intended to bring the bull 
into a position in which it has completed its rush and is extended at 
the precise moment that the bullfighter is ready to plunge the 
estoque into the bull’s heart. 

What I have said concerning the fight between the mongoose and 
the cobra, or the toreador and the bull, will also apply to physical 
contests between man and man. Consider a duel with the small- 
sword. It consists of a sequence of feints, parries, and thrusts, with 
the intention on the part of each participant to bring his opponent’s 
sword out of line to such an extent that he can thrust home without 
laying himself open to a double encounter. Again, in a champion- 
ship game of tennis, it is not enough to serve or return the ball 
perfectly as far as each individual stroke is considered; the strategy 
is rather to force the opponent into a series of returns which put him 
progressively in a worse position until there is no way in which he can 
return the ball safely. 

These physical contests and the sort of games which we have 
supposed the game-playing machine to play both have the same 
element of learning in terms of experience of the opponent’s habits as 
well as one’s own. What is true of games of physical encounter is 
also true of contests in which the intellectual element is stronger, such 
as war and the games which simulate war, by which our staff officers 
win the elements of their military experience. This is true for 
classical war both on land and at sea, and is equally true with the new 
and as yet untried war with atomic weapons. Some degree of 
mechanization, parallel to the mechanization of checkers by learning 
machines, is possible in all these. 

There is nothing more dangerous to contemplate than World War 
III. It is worth considering whether part of the danger may not be 
intrinsic in the unguarded use of learning machines. Again and 
again I have heard the statement that learning machines cannot 
subject us to any new dangers, because we can turn them off when 
we feel like it. But can we? To turn a machine off effectively, we 
must be in possession of information as to whether the danger point 
has come. The mere fact that we have made the machine does not 
guarantee that we shall have the proper information to do this. This 
is already implicit in the statement that the checker-playing machine 
can defeat the man who has programmed it, and this after a very 
limited time of working in. Moreover, the very speed of operation of 
modern digital machines stands in the way of our ability to perceive 
and think through the indications of danger. 

The idea of non-human devices of great power and great ability 
to carry through a policy, and of their dangers, is nothing new. All 
that is new is that now we possess effective devices of this kind. In 
the past, similar possibilities were postulated for the techniques of 
magic, which forms the theme for so many legends and folk tales. 
These tales have thoroughly explored the moral situation of the 
magician. I have already discussed some aspects of the legendary 
ethics of magic in an earlier book entitled The Human Use of Human 
Beings. 1 I here repeat some of the material which I have discussed 
there, in order to bring it out more precisely in its new context of 
learning machines. 

One of the best-known tales of magic is Goethe’s “The Sorcerer’s 
Apprentice.” In this, the sorcerer leaves his apprentice and factotum 
alone with the chore of fetching water. As the boy is lazy and 
ingenious, he passes the work over to a broom, to which he has 
uttered the words of magic which he has heard from his master. The 
broom obligingly does the work for him and will not stop. The boy 
is on the verge of being drowned out. He finds that he has not 
learned, or has forgotten, the second incantation which is to stop 
the broom. In desperation, he takes the broomstick, breaks it over 
his knee, and finds to his consternation that each half of the broom 
continues to fetch water. Luckily, before he is completely destroyed, 
the master returns, says the Words of Power to stop the broom, and 
administers a good scolding to the apprentice. 

Another story is the Arabian Nights tale of the fisherman and the 
genie. The fisherman has dredged up in his net a jug closed with 
the seal of Solomon. It is one of the vessels in which Solomon has 
imprisoned the rebellious genie. The genie emerges in a cloud of 
smoke, and the gigantic figure tells the fisherman that, whereas in his 
first years of imprisonment he had resolved to reward his rescuer with 
power and fortune, he has now decided to slay him out of hand. 
Luckily for himself, the fisherman finds a way to talk the genie back 
into the bottle, upon which he casts the jar to the bottom of the 
ocean. 

1 Wiener, N., The Human Use of Human Beings; Cybernetics and Society, Houghton 
Mifflin Company, Boston, 1950. 

More terrible than either of these two tales is the fable of the 
monkey’s paw, written by W. W. Jacobs, an English writer of the 
beginning of the century. A retired English workingman is sitting 
at his table with his wife and a friend, a returned British sergeant- 
major from India. The sergeant-major shows his hosts an amulet in 
the form of a dried, wizened monkey’s paw. This has been endowed 
by an Indian holy man, who has wished to show the folly of defying 
fate, with the power of granting three wishes to each of three people. 
The soldier says that he knows nothing of the first two wishes of the 
first owner, but the last one was for death. He himself, as he tells 
his friends, was the second owner but will not talk of the horror of 
his own experiences. He casts the paw into the fire, but his friend 
retrieves it and wishes to test its powers. His first is for £200. 
Shortly thereafter there is a knock at the door, and an official of the 
company by which his son is employed enters the room. The father 
learns that his son has been killed in the machinery, but that the 
company, without recognizing any responsibility or legal obligation, 
wishes to pay the father the sum of £200 as a solatium. The grief- 
stricken father makes his second wish—that his son may return—and 
when there is another knock at the door and it is opened, something 
appears which, we are not told in so many words, is the ghost of the 
son. The final wish is that this ghost should go away. 

In all these stories the point is that the agencies of magic are 
literal-minded; and that if we ask for a boon from them, we must ask 
for what we really want and not for what we think we want. The 
new and real agencies of the learning machine are also literal-minded. 
If we program a machine for winning a war, we must think well what 
we mean by winning. A learning machine must be programmed by 
experience. The only experience of a nuclear war which is not 
immediately catastrophic is the experience of a war game. If we 
are to use this experience as a guide for our procedure in a real 
emergency, the values of winning which we have employed in the 
programming games must be the same values which we hold at heart 
in the actual outcome of a war. We can fail in this only at our 
immediate, utter, and irretrievable peril. We cannot expect the 
machine to follow us in those prejudices and emotional compromises 
by which we enable ourselves to call destruction by the name of 
victory. If we ask for victory and do not know what we mean by it, 
we shall find the ghost knocking at our door. 

So much for learning machines. Now let me say a word or two 
about self-propagating machines. Here both the words machine and 
self-propagating are important. The machine is not only a form of 
matter, but an agency for accomplishing certain definite purposes. 
And self-propagation is not merely the creation of a tangible replica; 
it is the creation of a replica capable of the same functions. 

Here, two different points of view come into evidence. One of 
these is purely combinatorial and concerns the question whether a 
machine can have enough parts and sufficiently complicated structure 
to enable self-reproduction to be among its functions. This question 
has been answered in the affirmative by the late John von Neumann. 
The other question concerns an actual operative procedure for build- 
ing self-reproducing machines. Here I shall confine my attentions 
to a class of machines which, while it does not embrace all machines, 
is of great generality. I refer to the non-linear transducer. 

Such machines are apparatuses which have as an input a single 
function of time and which have as their output another function 
of time. The output is completely determined by the past of the 
input; but in general, the adding of inputs does not add the 
corresponding outputs. Such pieces of apparatus are known as trans- 
ducers. One property of all transducers, linear or non-linear, is an 
invariance with respect to a translation in time. If a machine 
performs a certain function, then, if the input is shifted back in time, 
the output is shifted back by the same amount. 
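
Both properties can be checked on a small discrete-time example. The particular transducer below, the zero padding before the start of the record, and the helper names are hypothetical choices made for the sketch; the text itself speaks of functions of continuous time.

```python
def transducer(x):
    """Hypothetical non-linear transducer: y[n] = (x[n] + 0.5 * x[n-1]) ** 2.

    The output at step n depends only on the input up to step n; the input
    before the start of the record is taken to be zero.
    """
    out = []
    prev = 0.0
    for cur in x:
        out.append((cur + 0.5 * prev) ** 2)
        prev = cur
    return out


def delay(x, k):
    """Shift a sequence k steps later in time, padding with zeros."""
    return [0.0] * k + list(x)


a = [1.0, 2.0, 0.0]
b = [1.0, 0.0, 0.0]

# Invariance under translation in time: delaying the input delays the output.
shift_invariant = transducer(delay(a, 2)) == delay(transducer(a), 2)

# Non-linearity: the response to a sum is not the sum of the responses.
response_to_sum = transducer([p + q for p, q in zip(a, b)])
sum_of_responses = [p + q for p, q in zip(transducer(a), transducer(b))]
additive = response_to_sum == sum_of_responses
```

Here `shift_invariant` is true and `additive` is false, which is exactly the combination that distinguishes a non-linear transducer from a linear one.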

Basic to our theory of self-reproducing machines is a canonical 
form of the representation of non-linear transducers. Here the 
notions of impedance and admittance, which are so essential in the 
theory of linear apparatus, are not fully appropriate. We shall 
have to refer to certain newer methods of carrying out this representa- 
tion, methods developed partly by me 1 and partly by Professor 
Dennis Gabor 2 of the University of London. 

While both Professor Gabor’s methods and my own lead to the 
construction of non-linear transducers, they are linear to the extent 
that the non-linear transducer is represented with an output which 
is the sum of the outputs of a set of non-linear transducers with the 
same input. These outputs are combined with varying linear co- 
efficients. This allows us to employ the theory of linear develop- 
ments in the design and specification of the non-linear transducer. 
And in particular, this method allows us to obtain coefficients of the 
constituent elements by a least-square process. If we join this to a 
method of statistically averaging over the set of all inputs to our 
apparatus, we have essentially a branch of the theory of orthogonal 
development. Such a statistical basis of the theory of non-linear 
transducers can be obtained from an actual study of the past statistics 
of the inputs used in each particular case. 

1 Wiener, N., Nonlinear Problems in Random Theory, The Technology Press of 
M.I.T. and John Wiley & Sons, Inc., New York, 1958. 

2 Gabor, D., “Electronic Inventions and Their Impact on Civilization,” Inaugural 
Lecture, March 3, 1959, Imperial College of Science and Technology, University of 
London, England. 

This is a rough account of Professor Gabor’s methods. While 
mine are essentially similar, the statistical basis for my work is 
slightly different. 

It is well known that electrical currents are not conducted con- 
tinuously but by a stream of electrons which must have statistical 
variations from uniformity. These statistical fluctuations can be 
represented fairly by the theory of the Brownian motion, or by the 
similar theory of shot effect or tube noise, about which I am going to 
say something in the next chapter. At any rate, apparatus can be 
made to generate a standardized shot effect with highly specific 
statistical distribution, and such apparatus is being manufactured 
commercially. Note that tube noise is in a sense a universal input 
in that its fluctuations over a sufficiently long time will sooner or 
later approximate to any given curve. This tube noise possesses a 
very simple theory of integration and averaging. 

In terms of the statistics of tube noise, we can easily determine a 
closed set of normal and orthogonal non-linear operations. If the 
inputs subject to these operations have the statistical distribution 
appropriate to tube noise, the average product of the output of two 
component pieces of our apparatus, where this average is taken with 
respect to the statistical distribution of tube noise, will be zero. 
Moreover, the mean square output of each apparatus can be nor- 
malized to one. The result is that the development of the general 
non-linear transducer in terms of these components results from an 
application of the familiar theory of orthonormal functions. 
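
In symbols, the statement of this paragraph may be written as follows. The notation is an assumption of this note and is not taken from the text: $\varphi_k(t)$ denotes the output of the $k$-th component apparatus, $y(t)$ the output of the transducer to be represented, $c_k$ the adjustable coefficients, and the bar the average over the statistical distribution of tube noise.

```latex
\[
  \overline{\varphi_j \varphi_k} = \delta_{jk}, \qquad
  y(t) \approx \sum_{k=0}^{n} c_k \, \varphi_k(t), \qquad
  c_k = \overline{y \, \varphi_k},
\]
\[
  \overline{\Bigl( y - \sum_{k=0}^{n} c_k \, \varphi_k \Bigr)^{2}}
  = \overline{y^{2}} - \sum_{k=0}^{n} c_k^{2}.
\]
```

The first line states the orthonormality of the components and the resulting rule for the coefficients; the second is the usual expression for the mean square error of a finite orthonormal development, which these coefficients make as small as possible.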

To be specific, our individual pieces of apparatus give outputs 
which are products of Hermite polynomials in the Laguerre coeffi- 
cients of the past of the input. This is presented in detail in my 
Nonlinear Problems in Random Theory. 

It is of course difficult to average in the first instance over a set of 
possible inputs. What makes this difficult task realizable is that the 
shot-effect inputs possess the property known as metric transitivity, 
or the ergodic property. Any integrable function of the parameter 
of distribution of shot-effect inputs has in almost every instance a 
time average equal to its average over the ensemble. This permits 
us to take two pieces of apparatus with a common shot-effect 
input, and to determine the average of their product over the entire 
ensemble of the possible inputs, by taking their product and averaging 
it over the time. The repertory of operations needed for all these 
processes involves nothing more than the addition of potentials, the 
multiplication of potentials, and the operation of averaging over time. 
Devices exist for all these. As a matter of fact, the elementary 
devices needed in Professor Gabor’s methodology are the same as 
those needed in mine. One of his students has invented a particularly 
effective and inexpensive multiplying device depending on the piezo- 
electric effect on a crystal of the attraction of two magnetic coils. 

What this amounts to is that we can imitate any unknown non- 
linear transducer by a sum of linear terms, each of fixed characteris- 
tics and with an adjustable coefficient. This coefficient can be 
determined as the average product of the outputs of the unknown 
transducer and a particular known transducer, when the same shot- 
effect generator is connected to the input of both. What is more, 
instead of computing this result on the scale of an instrument and 
then transferring it by hand to the appropriate transducer, thus 
producing a piecemeal simulation of the apparatus, there is no 
particular problem in automatically effecting the transfer of the 
coefficients to the pieces of feedback apparatus. What we have 
succeeded in doing is to make a white box which can potentially 
assume the characteristics of any non-linear transducer whatever, 
and then to draw it into the similitude of a given black-box 
transducer by subjecting the two to the same random input and connecting 
the outputs of the structures in the proper manner, so as to arrive 
at the suitable combination without any intervention on our part. 
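
The procedure can be imitated numerically in a much reduced form. The sketch below makes several simplifying assumptions that are not in the text: the unknown transducer is memoryless (its output depends only on the present input, so the Laguerre stage is omitted), a pseudo-random Gaussian sequence stands in for the shot-effect generator, only three normalized Hermite terms are used, and all function names are hypothetical.

```python
import math
import random


def basis(x):
    """First three normalized Hermite polynomials of a unit Gaussian sample."""
    return [1.0, x, (x * x - 1.0) / math.sqrt(2.0)]


def black_box(x):
    """Stand-in for the unknown transducer (hypothetical and memoryless)."""
    return x * x + 2.0 * x


def fit_white_box(unknown, samples=200_000, seed=0):
    """Set each coefficient to the time average of (unknown output * term output)."""
    rng = random.Random(seed)
    sums = [0.0, 0.0, 0.0]
    for _ in range(samples):
        x = rng.gauss(0.0, 1.0)  # the same random input feeds both boxes
        y = unknown(x)
        for k, phi in enumerate(basis(x)):
            sums[k] += y * phi
    return [s / samples for s in sums]


def white_box(coeffs, x):
    """Output of the adjustable apparatus: a weighted sum of the fixed terms."""
    return sum(c * phi for c, phi in zip(coeffs, basis(x)))


coeffs = fit_white_box(black_box)
```

For this particular black box the exact coefficients are 1, 2, and the square root of 2, and the time averages approach them as the run is lengthened; no step in the loop requires anything beyond multiplication, addition, and averaging over time.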

I ask if this is philosophically very different from what is done when 
a gene acts as a template to form other molecules of the same gene 
from an indeterminate mixture of amino and nucleic acids, or when a 
virus guides into its own form other molecules of the same virus out 
of the tissues and juices of its host. I do not in the least claim that 
the details of these processes are the same, but I do claim that they 
are philosophically very similar phenomena. 

Brain Waves 

and Self-Organizing Systems 


In the previous chapter, I discussed the problems of learning and 
self-propagation as they apply both to machines and, at least by 
analogy, to living systems. Here I shall repeat certain comments 
I made in the Preface and which I intend to put to immediate use. 
As I have pointed out, these two phenomena are closely related to 
each other, for the first is the basis for the adaptation of the individual 
to its environment by means of experience, which is what we may call 
ontogenetic learning, while the second, as it furnishes the material 
on which variation and natural selection may operate, is the basis of 
phylogenetic learning. As I have already mentioned, the mammals, 
in particular man, do a large part of their adjustment to their 
environment by ontogenetic learning, whereas the birds, with their 
highly varied patterns of behavior which are not learned in the life 
of the individual, have devoted themselves much more to 
phylogenetic learning. 

We have seen the importance of non-linear feedbacks in the origi- 
nation of both processes. The present chapter is devoted to the 
study of a specific self-organizing system in which non-linear 
phenomena play a large part. What I here describe is what I believe to be 
happening in the self-organization of electroencephalograms or brain 
waves. 

Before we can discuss this matter intelligently, I must say some- 
thing of what brain waves are and how their structure may be sub- 
jected to precise mathematical treatment. It has been known for 
many years that activity of the nervous system is accompanied by 
certain electrical potentials. The first observations in this field go 
back to the beginning of the last century and were made by Volta and 
Galvani in neuromuscular preparations from the leg of the frog. 
This was the birth of the science of electrophysiology. This science, 
however, advanced rather slowly until the end of the first quarter of 
the present century. 

It is well worth reflecting why the development of this branch of 
physiology was so slow. The original apparatus used for the study 
of physiological electric potential consisted of galvanometers. These 
had two weaknesses. The first was that the entire energy involved 
in moving the coil or needle of the galvanometer came from the nerve 
itself and was excessively minute. The second difficulty was that 
the galvanometer of those times was an instrument whose mobile 
parts had quite appreciable inertia, and a very definite restoring 
force was necessary to bring the needle to a well-defined position; that 
is, in the nature of the case, the galvanometer was not only a 
recording instrument but a distorting instrument. The best of the 
early physiological galvanometers was the string galvanometer of 
Einthoven, where the moving parts were reduced to a single wire. 
Excellent as this instrument was by the standards of its own time, it 
was not good enough to record small electrical potentials without 
heavy distortions. 

Thus electrophysiology had to wait for a new technique. This 
technique was that of electronics, and took two forms. One of these 
was based on Edison’s discovery of certain phenomena pertaining to 
the conduction of gases, and from these arose the use of the vacuum 
tube or electric valve for amplification. This made it possible to give 
a reasonably faithful transformation of weak potentials into strong 
potentials. And so it permitted us to move the final elements of the 
recording device by the use of energy not emanating from the nerve 
but controlled by it. 

The second invention also involved the conduction of electricity in 
vacuo, and is known as the cathode-ray oscillograph. This made it 
possible to use as the moving part of the instrument a much lighter 
armature than that of any previous galvanometer, namely, a stream 
of electrons. With the aid of these two devices, separately or to- 
gether, the physiologists of this century have been able to follow 
faithfully the time course of small potentials which would have been 
completely beyond the range of accurate instrumentation possible in 
the nineteenth century. 

With these means, we have been able to obtain accurate records of 
the time course of the minute potentials arising between two 
electrodes placed on the scalp or implanted in the brain. While these 
potentials had already been observed in the nineteenth century, the 
availability of the new accurate records excited great hopes among 
the physiologists of twenty or thirty years ago as to the possibilities 
of using the devices for the direct study of brain activity. Leaders in 
this field were Berger in Germany, Adrian and Matthews in England, 
and Jasper, Davis, and the Gibbs (husband and wife) in the United 
States. 

It must be admitted that the later development of
electroencephalography has up to now been unable to fulfill the rosy
hopes entertained by the early workers in the field. The data which they
obtained were recorded by an ink-writer. They are very complicated
and irregular curves; and although it was possible to discern certain
predominating frequencies, such as the alpha rhythm of about 10
oscillations per second, the ink record was not in a suitable form for
further mathematical manipulation. The result is that
electroencephalography became more an art than a science, and depended
on the ability of the trained observer to recognize certain properties
of the ink record on the basis of a large experience. This had the
very fundamental objection of making the interpretation of the
electroencephalograms a largely subjective matter.

In the late twenties and the early thirties, I had become interested
in the harmonic analysis of continuing processes. While the
physicists had previously considered such processes, the mathematics of
harmonic analysis had been almost confined to the study of either
periodic processes or those which in some sense tended to zero as the
time became large, positively or negatively. My work was the earliest
attempt to put the harmonic analysis of continuing processes on a
firm mathematical basis. In this, I found that the fundamental
notion was that of autocorrelation, which had already been used by
G. I. Taylor (now Sir Geoffrey Taylor) in the study of turbulence. 1

This autocorrelation for a time function f(t) is represented by the
time-mean of the product f(t + τ) by f(t). It is advantageous to
introduce complex functions of the time even though in the actual
cases studied we are dealing with real functions. And now the
autocorrelation becomes the mean of the product of f(t + τ) with the
conjugate of f(t). Whether we are working with real or complex
functions, the power spectrum of f(t) is given by the Fourier
transform of the autocorrelation.
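
The relation between autocorrelation and power spectrum can be checked
numerically on a short sampled record. The sketch below is only an
illustration under stated assumptions: the record is treated as
periodic (a circular time-mean), the test signal and the helper names
`dft` and `autocorrelation` are invented for the example, and a plain
discrete Fourier sum stands in for the continuous transform.

```python
import cmath
import math

def dft(x):
    """Plain O(N^2) discrete Fourier transform, standard library only."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * math.pi * m * k / n) for k in range(n))
            for m in range(n)]

def autocorrelation(f):
    """Circular time-mean of f(t + tau) * conjugate(f(t)) for every lag tau."""
    n = len(f)
    return [sum(f[(t + tau) % n] * complex(f[t]).conjugate() for t in range(n)) / n
            for tau in range(n)]

n = 64
# Assumed test signal: a strong component at 10 cycles per record, a weak one at 3.
f = [math.cos(2 * math.pi * 10 * t / n) + 0.3 * math.sin(2 * math.pi * 3 * t / n)
     for t in range(n)]

# Power spectrum obtained as the Fourier transform of the autocorrelation ...
spectrum_from_autocorrelation = [c.real for c in dft(autocorrelation(f))]
# ... and, for comparison, directly as |F|^2 / N.
spectrum_direct = [abs(F) ** 2 / n for F in dft(f)]

peak = max(range(n // 2), key=lambda m: spectrum_from_autocorrelation[m])
```

Both routes give the same numbers up to rounding, and `peak` picks out
the strong component of the assumed signal.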

I have already spoken of the lack of suitability of ink records for
further mathematical manipulations. Before much could come of
the idea of autocorrelation, it was necessary to replace these ink
records by other records better adapted to instrumentation.

1 Taylor, G. I., “Diffusion by Continuous Movements,” Proceedings of the London
Mathematical Society, Ser. 2, 20, 196-212 (1921-1922).

One of the best ways of recording small fluctuating electric
potentials for further manipulation is the use of magnetic tape. This
allows the storage of the fluctuating electric potential in a permanent
form which can be used later whenever convenient. One such
instrument was devised about a decade ago in the Research
Laboratory of Electronics of the Massachusetts Institute of Technology,
under the guidance of Professor Walter A. Rosenblith and Dr. Mary
A. B. Brazier. 1

In this apparatus, magnetic tape is used in its frequency-modula- 
tion form. The reason for this is that the reading of magnetic tape 
always involves a certain amount of erasure. With amplitude- 
modulation tape, this erasure gives rise to a change in the message 
carried, so that in successive readings of the tape we are actually 
following a changing message. 

In frequency modulation there is also a certain amount of erasure,
but the instruments by which we read the tape are relatively
insensitive to amplitude, and read frequency only. Until the tape is
so badly erased that it is completely unreadable, the partial erasure
of the tape does not distort appreciably the message which it carries.
The result is that the tape can be read very many times with
substantially the same accuracy with which it was first read.
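
The contrast between the two kinds of reading can be shown with a toy
carrier. This is a sketch under assumptions, not a model of the actual
tape apparatus: partial erasure is represented as a uniform loss of
amplitude, the sampling rate, the 500-cycle carrier and the factor 0.6
are arbitrary, and `read_amplitude` and `read_frequency` are
hypothetical stand-ins for the two kinds of playback.

```python
import math

rate = 10000                       # samples per second (assumed)
n = 1000                           # 0.1 second of tape

def carrier(freq_hz, amplitude=1.0):
    """Sine carrier; the phase offset keeps every sample away from exact zero."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * k / rate + 0.3)
            for k in range(n)]

def read_amplitude(samples):
    """Amplitude-sensitive reading: the peak value found on the tape."""
    return max(abs(s) for s in samples)

def read_frequency(samples):
    """Frequency-only reading: count sign changes, two per cycle."""
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0))
    return crossings / (2.0 * n / rate)

fresh = carrier(500.0)
worn = [0.6 * s for s in fresh]    # partial erasure modelled as lost amplitude

amplitude_change = read_amplitude(worn) / read_amplitude(fresh)
frequency_change = read_frequency(worn) - read_frequency(fresh)
```

In this toy case the amplitude reading drops with the erasure while
the zero-crossing count, and hence the frequency reading, is unchanged.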

As will be seen from the nature of the autocorrelation, one of the
tools which we need is a mechanism which will delay the reading of
tape by an adjustable amount. If a length of the magnetic tape
record having time-duration A is played on an apparatus having two
playback heads, one following the other, two signals are generated
which are the same except for a relative displacement in time. The
time displacement depends on the distance between the playback
heads and on the tape speed, and can be varied at will. We can call
one of these f(t) and the other f(t + τ), where τ is the time
displacement. The product of the two can be formed, for example, by using
square-law rectifiers and linear mixers, and taking advantage of the
identity

$$4ab = (a + b)^2 - (a - b)^2 \tag{10.01}$$

The product can be averaged approximately by integrating with a
resistor-capacitor network having a time constant long compared with
the duration A of the sample. The resulting average is proportional
to the value of the autocorrelation function for delay τ. Repetition
of the process for various values of τ yields a set of values of the
autocorrelation (or rather, the sampled autocorrelation over a large
time base A). The accompanying graph, Fig. 9, shows a plot of an
actual autocorrelation of this sort. 1 Let us note that we have shown
only half the curve, for the autocorrelation for negative times would
be the same as that for positive times, at least if the curve of which we
are taking the autocorrelation is real.

[Fig. 9. Autocorrelation.]

1 Barlow, J. S., and R. M. Brown, An Analog Correlator System for Brain Potentials,
Technical Report 300, Research Laboratory of Electronics, Cambridge,
Mass. (1955).
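
The chain of operations just described (two playback heads, a
square-law multiplier, an average, and a repetition for each head
spacing) can be imitated digitally. The following is a minimal sketch
under assumptions: a plain arithmetic mean replaces the
resistor-capacitor integrator, the record is a pure 10-cycle cosine,
and the sampling rate and record length are arbitrary choices for the
example.

```python
import math

def quarter_square_product(a, b):
    """Multiply with squarers and mixers only: 4ab = (a + b)^2 - (a - b)^2."""
    return ((a + b) ** 2 - (a - b) ** 2) / 4.0

def sampled_autocorrelation(signal, lag):
    """Mean of signal[t] * signal[t + lag] over the stretch where both heads read."""
    n = len(signal) - lag
    return sum(quarter_square_product(signal[t], signal[t + lag])
               for t in range(n)) / n

rate = 200        # samples per second (assumed)
duration = 20     # seconds of record, playing the role of A (assumed)
record = [math.cos(2 * math.pi * 10 * k / rate) for k in range(rate * duration)]

# One autocorrelation value per head spacing: lags 0 to 0.2 s in steps of 1/200 s.
lags = range(0, 41)
curve = [sampled_autocorrelation(record, lag) for lag in lags]
```

For this assumed record the curve is a cosine of period 0.1 second
with amplitude one half, so `curve[0]` is near 0.5, `curve[5]` near
zero, and `curve[10]` near -0.5.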

Note that similar autocorrelation curves have been used for many
years in optics, and that the instrument by which they have been
obtained is the Michelson interferometer, Fig. 10. By a system of
mirrors and lenses, the Michelson interferometer divides a beam of
light into two parts which are sent on paths of different length and
then reunited into one beam. Different path lengths result in different
time delays, and the resultant beam is the sum of two replicas of the
incoming beam, which may once more be termed f(t) and f(t + τ).
When the beam intensity is measured with a power-sensitive
photometer, the reading of the photometer is proportional to the square of
f(t) + f(t + τ), and hence contains a term proportional to the
autocorrelation. In other words, the intensity of the interferometer
fringes (except for a linear transformation) will give us the
autocorrelation.

[Fig. 10. Michelson interferometer, with photometer.]

1 This work was undertaken with the cooperation of the Neurophysiology
Laboratory of the Massachusetts General Hospital and the Communications Biophysics
Laboratory of M.I.T.
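
The phrase "except for a linear transformation" can be made concrete.
For a record treated as periodic, the mean of the square of
f(t) + f(t + τ) equals twice the mean square of f plus twice the
autocorrelation at τ. The sketch below checks this on an invented
two-component signal; `photometer_reading` is a hypothetical name for
the time-averaged intensity, not a model of a real photometer.

```python
import math

n = 400
# Assumed incoming beam: two spectral components of unequal strength.
f = [math.cos(2 * math.pi * 10 * t / n) + 0.5 * math.cos(2 * math.pi * 13 * t / n)
     for t in range(n)]

def mean_product(tau):
    """Circular time-mean of f(t) * f(t + tau), i.e. the autocorrelation."""
    return sum(f[t] * f[(t + tau) % n] for t in range(n)) / n

def photometer_reading(tau):
    """Time-mean of the squared sum of the two replicas of the beam."""
    return sum((f[t] + f[(t + tau) % n]) ** 2 for t in range(n)) / n

taus = list(range(60))
fringes = [photometer_reading(tau) for tau in taus]

# Undo the linear transformation: subtract the constant term and halve.
recovered = [(reading - 2 * mean_product(0)) / 2 for reading in fringes]
```

The list `recovered` agrees with the autocorrelation computed directly
by `mean_product`.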


All of this was implicit in Michelson's work. It will be seen that,
by carrying out a Fourier transformation on the fringes, the
interferometer yields us the power spectrum of the light, and is in fact a
spectrometer. It is indeed the most accurate type of spectrometer
known to us.

This type of spectrometer has only come into its own in recent
years. I am told that it is now accepted as an important tool for
precision measurements. The significance of this is that the
techniques which I shall now present for the working up of
autocorrelation records are equally applicable in spectroscopy and offer methods
of pushing to the limit the information which can be yielded by a
spectrometer.

Let us discuss the technique of obtaining the spectrum of a brain
wave from an autocorrelation. Let C(t) be an autocorrelation of
f(t). Then C(t) can be written in the form

$$C(t) = \int_{-\infty}^{\infty} e^{2\pi i \omega t}\, dF(\omega) \tag{10.02}$$

Here F is always an increasing or at least a non-decreasing function
of ω, and we shall term it the integrated spectrum of f. In general,
this integrated spectrum is made up of three parts, combined additively.
The line part of the spectrum increases only at a denumerable set of
points. Take this away, and we are left with a continuous spectrum.
This continuous spectrum itself is the sum of two parts, one of which
increases only over a set of measure zero, while the other part is
absolutely continuous and is the integral of a positive integrable
function.

From now on let us suppose that the first two parts of the spectrum
(the discrete part and the continuous part which increases over a set
of measure zero) are missing. In this case, we can write

$$C(t) = \int_{-\infty}^{\infty} e^{2\pi i \omega t}\, \phi(\omega)\, d\omega \tag{10.03}$$

where φ(ω) is the spectral density. If φ(ω) is of Lebesgue class L²,
we can write

$$\phi(\omega) = \int_{-\infty}^{\infty} C(t)\, e^{-2\pi i \omega t}\, dt \tag{10.04}$$

As will be seen by looking at the autocorrelation of the brain
waves, the predominating part of the power of the spectrum is in the
neighborhood of 10 cycles. In such a case, φ(ω) will have a shape
similar to the following diagram.

[Diagram: φ(ω) against ω, with two peaks, one near -10 and one near 10.]

The two peaks near 10 and -10 are mirror images of each other.

The ways of performing a Fourier analysis numerically are various,
including the use of integrating instruments and numerical computing
processes. In both cases, it is an inconvenience to the work that the
principal peaks are near 10 and -10 and not near 0. However,
there are modes of transferring the harmonic analysis to the
neighborhood of zero frequency which greatly cut down the work
to be performed. Notice that

$$\phi(\omega - 10) = \int_{-\infty}^{\infty} C(t)\, e^{20\pi i t}\, e^{-2\pi i \omega t}\, dt \tag{10.05}$$

In other words, if we multiply C(t) by e^{20πit}, our new harmonic
analysis will give us a band in the neighborhood of zero frequency
and another band in the neighborhood of frequency +20. If we
then perform such a multiplication and remove the +20 band by
averaging methods equivalent to the use of a wave filter, we shall
have reduced our harmonic analysis to one in the neighborhood of
zero frequency.

Now

$$e^{20\pi i t} = \cos 20\pi t + i \sin 20\pi t \tag{10.06}$$

Therefore, the real and imaginary parts of C(t)e^{20πit} are given,
respectively, by C(t) cos 20πt and iC(t) sin 20πt. The removal of
the frequencies in the neighborhood of +20 can be performed by
putting these two functions through a low-pass filter, which is
equivalent to averaging them over an interval of a twentieth of a
second or greater.

Suppose that we have a curve where most of the power is nearly at
a frequency of 10 cycles. When we multiply this by the cosine or
sine of 20πt, we shall get a curve which is the sum of two parts,
one of them behaving locally like this:

[Diagram: the first curve.]

and the other like this:

[Diagram: the second curve, with an interval marked 1/20.]

When we average the second curve over the time for a length of a
tenth of a second, we get zero. When we average the first one, we
get half of the maximum height. The result is that, by the smoothing
of C(t) cos 20πt and iC(t) sin 20πt, we get, respectively, good
approximations to the real and imaginary part of a function having
all of its frequencies in the neighborhood of zero, and this function
will have the same distribution of frequency around zero that one part of
the spectrum of C(t) has around 10. Now let K₁(t) be the result of
smoothing C(t) cos 20πt and K₂(t) the result of smoothing C(t) sin
20πt. We wish to obtain

$$\int_{-\infty}^{\infty} [K_1(t) + iK_2(t)]\, e^{-2\pi i \omega t}\, dt = \int_{-\infty}^{\infty} [K_1(t) + iK_2(t)][\cos 2\pi\omega t - i \sin 2\pi\omega t]\, dt \tag{10.07}$$

This expression must be real, since it is a spectrum. Therefore, it
will equal

$$\int_{-\infty}^{\infty} K_1(t) \cos 2\pi\omega t\, dt + \int_{-\infty}^{\infty} K_2(t) \sin 2\pi\omega t\, dt \tag{10.08}$$


In other words, if we make a cosine analysis of K₁ and a sine analysis
of K₂, and add them together, we shall have the displaced spectrum
of f. It can be shown that K₁ will be even and K₂ will be odd.
This means that if we do a cosine analysis of K₁ and add or subtract
the sine analysis of K₂, we shall obtain, respectively, the spectrum to
the right and to the left of the central frequency at the distance ω.
This method for obtaining the spectrum we shall describe as the
method of heterodyning.
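
A numerical sketch of the heterodyning steps may help fix the
procedure. Everything specific in it is an assumption made for
illustration: the autocorrelation is a synthetic decaying cosine with
its peak at 9.6 cycles, the smoothing is a trapezoidal running mean
over a tenth of a second, and the integrals of (10.08) are replaced by
sums over a 1/200-second grid.

```python
import math

dt = 1.0 / 200.0                     # sampling step in seconds (assumed)
half = 2000                          # samples on each side of t = 0 (10 seconds)
times = [k * dt for k in range(-half, half + 1)]
f0 = 9.6                             # assumed location of the spectral peak

# Synthetic autocorrelation: a decaying cosine, even in t.
C = [math.exp(-abs(t) / 2.0) * math.cos(2 * math.pi * f0 * t) for t in times]

def smooth(values, width=20):
    """Trapezoidal running mean over `width` samples (1/10 s); edges are left at zero."""
    h = width // 2
    out = [0.0] * len(values)
    for i in range(h, len(values) - h):
        s = sum(values[i - h + 1:i + h]) + 0.5 * (values[i - h] + values[i + h])
        out[i] = s / width
    return out

# Multiply by the cosine and sine of 20*pi*t, then remove the band near 20.
K1 = smooth([c * math.cos(20 * math.pi * t) for c, t in zip(C, times)])
K2 = smooth([c * math.sin(20 * math.pi * t) for c, t in zip(C, times)])

def displaced_spectrum(omega):
    """Expression (10.08): cosine analysis of K1 plus sine analysis of K2."""
    return dt * sum(k1 * math.cos(2 * math.pi * omega * t)
                    + k2 * math.sin(2 * math.pi * omega * t)
                    for k1, k2, t in zip(K1, K2, times))

omegas = [round(-2.0 + 0.05 * k, 2) for k in range(81)]
values = [displaced_spectrum(w) for w in omegas]
peak_omega = omegas[values.index(max(values))]
```

With the sign convention implemented here (multiplication by
e^{20πit}, as in (10.05)), the sum evaluated at ω corresponds to the
original frequency 10 - ω, so the assumed peak at 9.6 appears at
`peak_omega` close to 0.4. The lists `K1` and `K2` come out even and
odd respectively, as the text states.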

In the case of autocorrelations which are locally nearly sinusoidal
of period, say, 0.1 (such as that which appears in the brain-wave
autocorrelation of Fig. 9), the computation involved in this method of
heterodyning may be simplified. We take our autocorrelation at
intervals of a fortieth of a second. We then take the sequence at 0,
1/20 second, 2/20 second, 3/20 second, and so on, and change the sign
of those fractions with odd numerators. We average these
consecutively for a suitable length of run and get a quantity nearly
equal to K₁(t). If we work similarly with the values at 1/40 second,
3/40 second, 5/40 second, and so on, changing the sign of alternate
quantities, and perform the same averaging process as before, we
get an approximation to K₂(t). From this stage on the procedure
is clear.

The justification for this procedure is that the distribution of mass
which is

    1 at points 2nπ
    -1 at points (2n + 1)π

while it is zero elsewhere, when it is subject to a harmonic analysis,
will contain a cosine component of frequency 1 and no sine
component. Similarly, a distribution of mass which is

    1 at (2n + 1/2)π
    -1 at (2n - 1/2)π

and

    0 elsewhere

will contain the sine component of frequency 1 and no cosine
component. Both distributions will also contain components of
frequencies N; but since the original curve which we are analyzing is
wanting or nearly wanting at these frequencies, these terms will
produce no effect. This greatly simplifies our heterodyning, because
the only factors which we have to multiply by are +1 or -1.
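
The sign-changing rule can be tried on a synthetic curve for which
the answer is known in closed form. The sketch assumes an
autocorrelation that is a decaying cosine at 9.6 cycles, a run of
three values for the averaging, and a comparison with the slowly
varying cosine and sine at the difference frequency 0.4; the results
play the role of K₁ and K₂ only up to a constant factor in this
sketch.

```python
import math

f0 = 9.6          # assumed peak frequency of the autocorrelation, cycles per second
decay = 2.0       # assumed decay time of the autocorrelation, seconds

def C(t):
    """Synthetic autocorrelation, locally nearly sinusoidal with period near 0.1 s."""
    return math.exp(-abs(t) / decay) * math.cos(2 * math.pi * f0 * t)

n = 200           # number of 1/20-second steps kept

# Values at 0, 1/20, 2/20, ...: change the sign of those with odd numerators.
even_t = [k / 20.0 for k in range(n)]
even_signed = [(-1) ** k * C(t) for k, t in enumerate(even_t)]

# Values at 1/40, 3/40, 5/40, ...: change the sign of alternate values.
odd_t = [(2 * j + 1) / 40.0 for j in range(n)]
odd_signed = [(-1) ** j * C(t) for j, t in enumerate(odd_t)]

def running_mean(values, run=3):
    """Centered mean of `run` consecutive values; the two ends are left unaveraged."""
    h = run // 2
    return [sum(values[i - h:i + h + 1]) / run if h <= i < len(values) - h else values[i]
            for i in range(len(values))]

K1_like = running_mean(even_signed)
K2_like = running_mean(odd_signed)

# Closed-form slowly varying targets for this particular synthetic curve.
delta = 10.0 - f0
target1 = [math.exp(-t / decay) * math.cos(2 * math.pi * delta * t) for t in even_t]
target2 = [math.exp(-t / decay) * math.sin(2 * math.pi * delta * t) for t in odd_t]

worst1 = max(abs(a - b) for a, b in zip(K1_like, target1))
worst2 = max(abs(a - b) for a, b in zip(K2_like, target2))
```

Only multiplications by +1 and -1 and a short average are used, which
is what makes the method practical by hand.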

We have found this method of heterodyning very useful in the
harmonic analysis of brain waves when we have only manual means
at our disposal, and when the bulk of the work becomes overwhelming
if we carry through all the details of the harmonic analysis without the
use of heterodyning. All of our earlier work with the harmonic
analysis of brain spectra has been done by the heterodyning method.
Since, however, it later proved possible to obtain the use of a digital
computer for which reducing the bulk of the work is not such a serious
consideration, much of our later work in harmonic analysis has been
done directly without the use of heterodyning. There will still be
much work to be done in places where digital computers are not
available, so that I do not consider the heterodyning method obsolete
in practice.

I am presenting here portions of a specific autocorrelation which
we have obtained in our work. Since the autocorrelation covers a
great length of data, it is not suitable for reproducing as a whole here,
and we give merely the beginning, in the neighborhood of τ = 0, and a
portion of it further out.

Figure 11 represents the results of a harmonic analysis of the
autocorrelation of which part is exhibited in Fig. 9. In this case, our
result was obtained with a high-speed digital computer, 1 but we have
found a very good concordance between this spectrum and the one
we obtained earlier through heterodyning methods by hand, at least
in the neighborhood of the strong part of the spectrum.

When we inspect the curve, we find a remarkable drop in power in
the neighborhood of frequency 9.05 cycles per second. The point at
which the spectrum substantially fades out is very sharp and gives
an objective quantity which can be verified with much greater
accuracy than any quantity so far occurring in
electroencephalography. There is a certain amount of indication that in other curves
which we have obtained, but which are of somewhat questionable
reliability in their details, this sudden fall-off in power is followed
quite shortly by a sudden rise, so that between them we have a dip
in the curve. Whether this be the case or not, there is a strong
suggestion that the power in the peak corresponds to a pulling of the
power away from the region where the curve is low.

[Fig. 11.]

1 The IBM-709 at the M.I.T. Computation Center was used.


In the spectrum which we have obtained, it is worth noting that
the overwhelming part of the peak lies within a range of about a third
of a cycle. An interesting thing is that with another
electroencephalogram of the same subject recorded four days later, this
approximate width of the peak is retained and there is more than a
suggestion that the form is retained in some detail. There is also
reason to believe that with other subjects the width of the peak will
be different and perhaps narrower. A thoroughly satisfactory
verification of this waits for investigations yet to be made.

It is highly desirable that the sort of work which we have mentioned
in these suggestions be followed up by more accurate instrumental
work with better instruments so that the suggestions which we here
make can be definitely verified or definitely rejected.

I now wish to take up the sampling problem. For this I shall have
to introduce some ideas from my previous work on integration in
function space. 1 With the aid of this tool, we shall be able to
construct a statistical model of a continuing process with a given
spectrum. While this model is not an exact replica of the process that
generates brain waves, it is near enough to it to yield statistically
significant information concerning the root-mean-square error to be
expected in brain-wave spectra such as the one already presented in
this chapter.

I here state without proof some properties of a certain real function
x(t, α) already stated in my paper on generalized harmonic analysis
and elsewhere. 1 The real function x(t, α) is dependent on a variable
t running from -∞ to ∞ and a variable α running from 0 to 1. It
represents one space variable of a Brownian motion dependent on
the time t and the parameter α of a statistical distribution. The
expression

$$\int_{-\infty}^{\infty} \phi(t)\, dx(t, \alpha) \tag{10.09}$$

is defined for all functions φ(t) of Lebesgue class L² from -∞ to ∞.
If φ(t) has a derivative belonging to L², Expression 10.09 is defined as

$$-\int_{-\infty}^{\infty} x(t, \alpha)\, \phi'(t)\, dt \tag{10.10}$$

and is then defined for all functions φ(t) belonging to L² by a certain
well-defined limit process. Other integrals

$$\int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} K(\tau_1, \ldots, \tau_n)\, dx(\tau_1, \alpha) \cdots dx(\tau_n, \alpha) \tag{10.11}$$

are defined in a similar manner. The fundamental theorem of
which we make use is that

$$\int_0^1 d\alpha \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} K(\tau_1, \ldots, \tau_n)\, dx(\tau_1, \alpha) \cdots dx(\tau_n, \alpha) \tag{10.12}$$

is obtained by putting

$$K_1(\tau_1, \ldots, \tau_{n/2}) = \sum K(\sigma_1, \sigma_2, \ldots, \sigma_n) \tag{10.13}$$

where the τ_k are formed in all possible ways by identifying all pairs of
the σ_k with each other (if n is even), and forming

$$\int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} K_1(\tau_1, \ldots, \tau_{n/2})\, d\tau_1 \cdots d\tau_{n/2} \tag{10.14}$$

If n is odd,

$$\int_0^1 d\alpha \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} K(\tau_1, \ldots, \tau_n)\, dx(\tau_1, \alpha) \cdots dx(\tau_n, \alpha) = 0 \tag{10.15}$$

1 Wiener, N., “Generalized Harmonic Analysis,” Acta Mathematica, 55, 117-258
(1930); Nonlinear Problems in Random Theory, The Technology Press of M.I.T. and
John Wiley & Sons, Inc., New York, 1958.


Another important theorem concerning these stochastic integrals
is that if ℱ{g} is a functional of g(t), such that ℱ[x(t, α)] is a function
belonging to L in α and depending only on the differences x(t₂, α)
- x(t₁, α), then for each t₁ for almost all values of α

$$\lim_{A \to \infty} \frac{1}{A} \int_0^A \mathcal{F}[x(t, \alpha)]\, dt = \int_0^1 \mathcal{F}[x(t_1, \alpha)]\, d\alpha \tag{10.16}$$

This is the ergodic theorem of Birkhoff, and has been proved by the
author 1 and others.

It has been established in the Acta Mathematica paper already
mentioned that if U is a real unitary transformation of the function
K(t),

$$\int_{-\infty}^{\infty} UK(t)\, dx(t, \alpha) = \int_{-\infty}^{\infty} K(t)\, dx(t, \beta) \tag{10.17}$$

where β differs from α only by a measure-preserving transformation
of the interval (0, 1) into itself.

Now let K(t) belong to L², and let

$$K(t) = \int_{-\infty}^{\infty} q(\omega)\, e^{2\pi i \omega t}\, d\omega \tag{10.18}$$

in the Plancherel 2 sense. Let us examine the real function

$$f(t, \alpha) = \int_{-\infty}^{\infty} K(t + \tau)\, dx(\tau, \alpha) \tag{10.19}$$

which represents the response of a linear transducer to a Brownian
input. This will have the autocorrelation

$$\lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} f(t + \tau, \alpha)\, \overline{f(t, \alpha)}\, dt \tag{10.20}$$

and this, by the ergodic theorem, will have for almost all values of α
the value

$$\int_0^1 d\alpha \int_{-\infty}^{\infty} K(t_1 + \tau)\, dx(t_1, \alpha) \int_{-\infty}^{\infty} \overline{K(t_2)}\, dx(t_2, \alpha) = \int_{-\infty}^{\infty} K(t + \tau)\, \overline{K(t)}\, dt \tag{10.21}$$

The spectrum will then almost always be

$$\int_{-\infty}^{\infty} e^{-2\pi i \omega \tau}\, d\tau \int_{-\infty}^{\infty} K(t + \tau)\, \overline{K(t)}\, dt = \left| \int_{-\infty}^{\infty} K(\tau)\, e^{-2\pi i \omega \tau}\, d\tau \right|^2 = |q(\omega)|^2 \tag{10.22}$$

1 Wiener, N., “The Ergodic Theorem,” Duke Mathematical Journal, 5, 1-39 (1939);
also in Modern Mathematics for the Engineer, E. F. Beckenbach (Ed.), McGraw-Hill,
New York, 1956, pp. 166-168.

2 Wiener, N., “Plancherel's Theorem,” The Fourier Integral and Certain of Its
Applications, The University Press, Cambridge, England, 1933, pp. 46-71; Dover
Publications, Inc., New York.
This is, however, the true spectrum. The sampled autocorrelation
over the averaging time A (in our case 2700 seconds) will be

$$\frac{1}{A} \int_0^A f(t + \tau, \alpha)\, f(t, \alpha)\, dt = \int_{-\infty}^{\infty} dx(t_1, \alpha) \int_{-\infty}^{\infty} dx(t_2, \alpha)\, \frac{1}{A} \int_0^A K(t_1 + \tau + s)\, K(t_2 + s)\, ds \tag{10.23}$$

The resulting sampled spectrum will almost always have the time
average

$$\int_{-\infty}^{\infty} e^{-2\pi i \omega \tau}\, d\tau\, \frac{1}{A} \int_0^A ds \int_{-\infty}^{\infty} K(t + \tau + s)\, K(t + s)\, dt = |q(\omega)|^2 \tag{10.24}$$

That is, the sampled spectrum and the true spectrum will have the
same time-average value.
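
A discrete imitation of this model shows the sampled autocorrelation
settling near the true one. The sketch is an analogue under
assumptions, not the continuous construction of the text: independent
Gaussian increments stand in for dx(t, α), a four-term list of
hypothetical weights stands in for K(t), the response is formed as a
moving sum, and a fixed seed selects one value of α.

```python
import random

random.seed(0)
K = [0.5, 1.0, 0.5, -0.25]      # hypothetical transducer weights, a stand-in for K(t)
A = 20000                       # number of samples averaged, the analogue of A

# Independent Gaussian increments stand in for dx(t, alpha).
dx = [random.gauss(0.0, 1.0) for _ in range(A + 2 * len(K))]

# Discrete moving-sum analogue of the response (10.19) to a Brownian input.
f = [sum(K[k] * dx[t + k] for k in range(len(K))) for t in range(A + len(K))]

def sampled_autocorrelation(tau):
    """Average of f(t + tau) * f(t) over the finite run of length A."""
    return sum(f[t + tau] * f[t] for t in range(A)) / A

def true_autocorrelation(tau):
    """Sum over k of K[k + tau] * K[k], the analogue of the right side of (10.21)."""
    return sum(K[k + tau] * K[k] for k in range(len(K) - tau))

sampled = [sampled_autocorrelation(tau) for tau in range(len(K))]
true = [true_autocorrelation(tau) for tau in range(len(K))]
worst = max(abs(s - t) for s, t in zip(sampled, true))
```

The discrepancy `worst` shrinks roughly as the inverse square root of
the averaging length, which is the kind of sampling error the
following paragraphs estimate.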

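For many purposes, we are interested in the approximate spectrum,
in which the integration of τ is carried out only over (0, B), where B
is 20 seconds in the particular case we have already exhibited. Let
us remember that f(t) is real, and that the autocorrelation is a
symmetrical function. Therefore, we can replace integration from 0
to B by integration from -B to B:

$$\int_{-B}^{B} e^{-2\pi i u \tau}\, d\tau \int_{-\infty}^{\infty} dx(t_1, \alpha) \int_{-\infty}^{\infty} dx(t_2, \alpha)\, \frac{1}{A} \int_0^A K(t_1 + \tau + s)\, K(t_2 + s)\, ds \tag{10.25}$$

This will have as its mean

$$\int_{-B}^{B} e^{-2\pi i u \tau}\, d\tau \int_{-\infty}^{\infty} K(t + \tau)\, K(t)\, dt = \int_{-B}^{B} e^{-2\pi i u \tau}\, d\tau \int_{-\infty}^{\infty} |q(\omega)|^2 e^{2\pi i \tau \omega}\, d\omega = \int_{-\infty}^{\infty} |q(\omega)|^2\, \frac{\sin 2\pi B(\omega - u)}{\pi(\omega - u)}\, d\omega \tag{10.26}$$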

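The square of the approximate spectrum taken over (-B, B) will be

$$\left| \int_{-B}^{B} e^{-2\pi i u \tau}\, d\tau \int_{-\infty}^{\infty} dx(t_1, \alpha) \int_{-\infty}^{\infty} dx(t_2, \alpha)\, \frac{1}{A} \int_0^A K(t_1 + \tau + s)\, K(t_2 + s)\, ds \right|^2$$

which will have as its mean

$$\begin{aligned}
&\int_{-B}^{B} e^{-2\pi i u \tau}\, d\tau \int_{-B}^{B} e^{2\pi i u \tau_1}\, d\tau_1\, \frac{1}{A^2} \int_0^A ds \int_0^A d\sigma \int_{-\infty}^{\infty} dt_1 \int_{-\infty}^{\infty} dt_2 \\
&\qquad \times [K(t_1 + \tau + s) K(t_1 + s) K(t_2 + \tau_1 + \sigma) K(t_2 + \sigma) \\
&\qquad\quad + K(t_1 + \tau + s) K(t_2 + s) K(t_1 + \tau_1 + \sigma) K(t_2 + \sigma) \\
&\qquad\quad + K(t_1 + \tau + s) K(t_2 + s) K(t_2 + \tau_1 + \sigma) K(t_1 + \sigma)] \\
&= \left[ \int_{-\infty}^{\infty} |q(\omega)|^2\, \frac{\sin 2\pi B(\omega - u)}{\pi(\omega - u)}\, d\omega \right]^2 \\
&\quad + \int_{-\infty}^{\infty} |q(\omega_1)|^2\, d\omega_1 \int_{-\infty}^{\infty} |q(\omega_2)|^2\, d\omega_2 \left[ \frac{\sin 2\pi B(\omega_1 - u)}{\pi(\omega_1 - u)} \right]^2 \frac{\sin^2 A\pi(\omega_1 - \omega_2)}{\pi^2 A^2 (\omega_1 - \omega_2)^2} \\
&\quad + \int_{-\infty}^{\infty} |q(\omega_1)|^2\, d\omega_1 \int_{-\infty}^{\infty} |q(\omega_2)|^2\, d\omega_2\, \frac{\sin 2\pi B(\omega_1 + u)}{\pi(\omega_1 + u)}\, \frac{\sin 2\pi B(\omega_2 - u)}{\pi(\omega_2 - u)}\, \frac{\sin^2 A\pi(\omega_1 - \omega_2)}{\pi^2 A^2 (\omega_1 - \omega_2)^2}
\end{aligned} \tag{10.27}$$

It is well known that, if m is used to express a mean,

$$m[\lambda - m(\lambda)]^2 = m(\lambda^2) - [m(\lambda)]^2 \tag{10.28}$$

Hence the root-mean-square error of the approximate sampled
spectrum will be equal to

$$\sqrt{ \int_{-\infty}^{\infty} |q(\omega_1)|^2\, d\omega_1 \int_{-\infty}^{\infty} |q(\omega_2)|^2\, d\omega_2\, \frac{\sin^2 A\pi(\omega_1 - \omega_2)}{\pi^2 A^2 (\omega_1 - \omega_2)^2} \left( \frac{\sin^2 2\pi B(\omega_1 - u)}{\pi^2 (\omega_1 - u)^2} + \frac{\sin 2\pi B(\omega_1 + u)}{\pi(\omega_1 + u)}\, \frac{\sin 2\pi B(\omega_2 - u)}{\pi(\omega_2 - u)} \right) } \tag{10.29}$$

Now,

$$\int_{-\infty}^{\infty} \frac{\sin^2 A\pi u}{\pi^2 A^2 u^2}\, du = \frac{1}{A} \tag{10.30}$$

Thus

$$\int_{-\infty}^{\infty} g(\omega)\, \frac{\sin^2 A\pi(\omega - u)}{\pi^2 A^2 (\omega - u)^2}\, d\omega \tag{10.31}$$

is 1/A multiplied by a running weighted average of g(ω). In case
the quantity averaged is nearly constant over the small range 1/A,
which is here a reasonable assumption, we shall obtain as an
approximate dominant of the root-mean-square error at any point of the
spectrum

$$\sqrt{ \frac{2}{A} \int_{-\infty}^{\infty} |q(\omega)|^4\, \frac{\sin^2 2\pi B(\omega - u)}{\pi^2 (\omega - u)^2}\, d\omega } \tag{10.32}$$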

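Let us notice that if the approximate sampled spectrum has its
maximum at u = 10, its value there will be

$$\int_{-\infty}^{\infty} |q(\omega)|^2\, \frac{\sin 2\pi B(\omega - 10)}{\pi(\omega - 10)}\, d\omega \tag{10.33}$$

which for smooth q(ω) will not be far from |q(10)|². The
root-mean-square error of the spectrum referred to this as a unit of measurement
will be

$$\sqrt{ \frac{2}{A} \int_{-\infty}^{\infty} \left| \frac{q(\omega)}{q(10)} \right|^4 \frac{\sin^2 2\pi B(\omega - 10)}{\pi^2 (\omega - 10)^2}\, d\omega } \tag{10.34}$$

and hence no greater than

$$\sqrt{ \frac{2}{A} \int_{-\infty}^{\infty} \frac{\sin^2 2\pi B(\omega - 10)}{\pi^2 (\omega - 10)^2}\, d\omega } = 2 \sqrt{\frac{B}{A}} \tag{10.35}$$

In the case we have considered, this will be

$$2 \sqrt{\frac{20}{2700}} = 2 \sqrt{\frac{1}{135}} \approx \frac{1}{6}$$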

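The size of this bound can be checked by direct numerical
integration. The sketch assumes that the bound has the form of the
square root of 2/A times the integral of
sin² 2πB(ω - 10) / π²(ω - 10)², with A = 2700 seconds and B = 20
seconds as quoted in the text; the integration range and step are
arbitrary choices, and the variable x stands for ω - 10.

```python
import math

A = 2700.0   # averaging time in seconds (from the text)
B = 20.0     # range of the lag in seconds (from the text)

def kernel_integral(b, half_width=50.0, steps=400000):
    """Midpoint-rule estimate of the integral of sin^2(2*pi*b*x) / (pi*x)^2
    over (-half_width, half_width); the midpoints never fall on x = 0."""
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps):
        x = -half_width + (i + 0.5) * h
        total += math.sin(2 * math.pi * b * x) ** 2 / (math.pi * x) ** 2
    return total * h

integral = kernel_integral(B)             # close to 2B, apart from the truncated tails
bound = math.sqrt(2.0 / A * integral)     # numerical value of the bound
closed_form = 2.0 * math.sqrt(B / A)      # 2 * sqrt(B / A)
```

Both `bound` and `closed_form` come out a little above 0.17, that is,
roughly one sixth of the peak value taken as the unit.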


If we assume then that the dip phenomenon is real, or even that
the sudden fall-off which takes place in our curve at a frequency of
about 9.05 cycles per second is real, it is worth while asking several
physiological questions concerning it. The three chief questions
concern the physiological function of these phenomena which we have
observed, the physiological mechanism by which they are produced,
and the possible application which can be made of these observations
in medicine.

Note that a sharp frequency line is equivalent to an accurate
clock. As the brain is in some sense a control and computation
apparatus, it is natural to ask whether other forms of control and
computation apparatus use clocks. In fact most of them do. Clocks
are employed in such apparatus for the purpose of gating. All such
apparatus must combine a large number of impulses into single
impulses. If these impulses are carried by merely switching the
circuit on or off, the timing of the impulses is of small importance and
no gating is needed. However, the consequence of this method of
carrying impulses is that an entire circuit is occupied until such time
as the message is turned off; and this involves putting a large part of
the apparatus out of action for an indefinite period. It is thus
desirable in a computing or control apparatus that the messages be
carried by a combined on-and-off signal. This immediately releases
the apparatus for further use. In order for this to take place, the
messages must be stored so that they can be released simultaneously,
and combined while they are still on the machine. For this a gating
is needed, and this gating can be conveniently carried out by the use
of a clock.

It is well known that, at least in the case of the longer nerve fibers,
nerve impulses are carried by peaks whose form is independent of the
manner in which they are produced. The combination of these
peaks is a function of the synaptic mechanism. In these synapses, a
number of incoming fibers are linked to an outgoing fiber. When the
proper combination of incoming fibers fires within a very short
interval of time, the outgoing fiber fires. In this combination, the
effect of the incoming fibers in certain cases is additive, so that if
more than a certain number fire, a threshold is reached which
permits the outgoing fiber to fire. In other cases some of the incoming
fibers have an inhibitory action, absolutely preventing the firing, or
at any rate increasing the threshold for the other fibers. In either
case, a short combination period is essential, and if the incoming
messages do not lie within this short period, they do not combine.
It is therefore necessary to have some sort of gating mechanism to
permit the incoming messages to arrive substantially simultaneously.
Otherwise the synapse will fail to act as a combining mechanism. 1
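
The dependence of the combination on timing can be put in a few
lines. The following is a deliberately crude sketch, not a
physiological model: each incoming impulse is a pair of arrival time
and weight, inhibitory fibers are given negative weights, and the
function `outgoing_fires`, the one-millisecond window and the
threshold of 3 are all assumptions made for the example.

```python
def outgoing_fires(spikes, threshold, window):
    """Return True if, within some interval of length `window` that starts at an
    incoming impulse, the summed weights of the impulses reach `threshold`.

    spikes: list of (arrival_time_in_seconds, weight); negative weights inhibit.
    """
    ordered = sorted(spikes)
    for i, (start, _) in enumerate(ordered):
        total = sum(w for t, w in ordered[i:] if t - start <= window)
        if total >= threshold:
            return True
    return False

window = 0.001   # assumed combination period of one millisecond

# Three excitatory impulses arriving almost together.
together = [(0.0000, 1), (0.0004, 1), (0.0008, 1)]
# The same three impulses spread over ten milliseconds.
spread = [(0.000, 1), (0.005, 1), (0.010, 1)]
# The close group again, with one inhibitory impulse among them.
inhibited = together + [(0.0005, -2)]

fires_together = outgoing_fires(together, threshold=3, window=window)
fires_spread = outgoing_fires(spread, threshold=3, window=window)
fires_inhibited = outgoing_fires(inhibited, threshold=3, window=window)
```

Only the first case fires. The same impulses, once spread out in time
or opposed by an inhibitory input, fail to combine, which is why some
gating of arrival times is needed.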

It is desirable, however, to have further evidence that this gating
actually takes place. Here some work of Professor Donald B.
Lindsley of the psychology department of the University of California
at Los Angeles is relevant. He has made a study of reaction times
for visual signals. As is well known, when a visual signal arrives, the
muscular activity which it stimulates does not occur at once, but
after a certain delay. Professor Lindsley has shown that this delay
is not constant, but seems to consist of three parts. One of these
parts is of constant length, whereas the other two appear to be
uniformly distributed over about 1/10 second. It is as if the central
nervous system could pick up incoming impulses only every 1/10
second, and as if the outgoing impulses to the muscles could arrive
from the central nervous system only every 1/10 second. This is
experimental evidence of a gating; and the association of this gating
with 1/10 second, which is the approximate period of the central
alpha rhythm of the brain, is very probably not fortuitous.
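
Why a gate that opens every 1/10 second should produce a delay
uniformly distributed over 1/10 second can be seen from a small
simulation. It is a sketch under assumptions: signals are taken to
arrive at times unrelated to the gate, a single gate is modelled, and
the function `wait_for_gate` and the number of trials are invented for
the illustration.

```python
import random

random.seed(1)
period = 0.1          # gate interval in seconds (the alpha period quoted in the text)

def wait_for_gate(arrival_time, phase=0.0):
    """Time from arrival until the next gate opening at phase + k * period."""
    elapsed = (arrival_time - phase) % period
    return (period - elapsed) % period

# Signals arriving at times unrelated to the gate.
arrivals = [random.uniform(0.0, 10.0) for _ in range(20000)]
waits = [wait_for_gate(t) for t in arrivals]

mean_wait = sum(waits) / len(waits)
fraction_in_first_half = sum(w < period / 2 for w in waits) / len(waits)
```

The waits fill the interval from 0 to 1/10 second evenly, with a mean
near 1/20 second. Two such gates in succession, added to a constant
part, would give a delay of the three-part form described above.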

So much for the function of the central alpha rhythm. Now the 
question arises concerning the mechanism producing this rhythm. 
Here we must bring up the fact that the alpha rhythm can be driven 
by flicker. If a light is flickered into the eye at intervals with a 
period near 1/10 second, the alpha rhythm of the brain is modified 
until it has a strong component of the same period as the flicker. 
Unquestionably the flicker produces an electrical flicker in the retina, 
and almost certainly in the central nervous system. 

There is, however, some direct evidence that a purely electrical
flicker may produce an effect similar to that of the visual flicker.
This experiment has been carried out in Germany. A room was
made with a conducting floor and an insulated conducting metal
plate suspended from the ceiling. Subjects were placed in this room,
and the floor and the ceiling were connected to a generator producing
an alternating electrical potential which may have been at a
frequency near 10 cycles per second. The experienced effect on the
subjects was very disturbing, in much the same manner as the effect
of a similar flicker is disturbing.

1 This is a simplified picture of what happens, especially in the cortex, since the
all-or-none operation of neurons depends on their being of a sufficient length so that
the remaking of the form of the incoming impulses in the neuron itself approaches an
asymptotic form. However, in the cortex for example, owing to the shortness of the
neurons, the necessity of synchronization still exists, although the details of the
process are much more complicated.

It will, of course, be necessary for these experiments to be repeated
under more controlled conditions, and for the simultaneous
electroencephalogram of the subjects to be taken. However, as far as the
experiments go, there is an indication that the same effect as that of
the visual flicker may be generated by an electrical flicker produced
by electrostatic induction.

It is important to observe that if the frequency of an oscillator can
be changed by impulses of a different frequency, the mechanism
must be non-linear. A linear mechanism acting on an oscillation of a
given frequency can produce only oscillation of the same frequency,
generally with some change of phase and amplitude. This is not
true for non-linear mechanisms, which may produce oscillations of
frequencies which are the sum and differences of different orders, of
the frequency of the oscillator and the frequency of the imposed
disturbance. It is quite possible for such a mechanism to displace a
frequency; and in the case which we have considered, this
displacement will be of the nature of an attraction. It is not too improbable
that this attraction will be a long-time or secular phenomenon, and
that for short times this system will remain approximately linear.
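
The difference between the two kinds of mechanism shows up at once in
a two-tone test. The sketch below is an illustration under
assumptions: the oscillator and the disturbance are taken as cosines
at 10 and 9 cycles, the linear mechanism is a gain with a delay, and
the non-linear one adds a square-law term; `amplitude` is a
hypothetical helper that measures one frequency component.

```python
import math

rate = 200    # samples per second (assumed)
n = 200       # one second of samples, so whole-number frequencies are resolved exactly

def amplitude(signal, freq):
    """Amplitude of the component of `signal` at `freq` cycles per second."""
    re = sum(s * math.cos(2 * math.pi * freq * k / rate) for k, s in enumerate(signal))
    im = sum(s * math.sin(2 * math.pi * freq * k / rate) for k, s in enumerate(signal))
    return 2.0 * math.hypot(re, im) / len(signal)

# Oscillator at 10 cycles plus an imposed disturbance at 9 cycles.
x = [math.cos(2 * math.pi * 10 * k / rate) + math.cos(2 * math.pi * 9 * k / rate)
     for k in range(n)]

# A linear mechanism: a change of amplitude and a (circular) delay.
linear = [0.5 * x[(k - 7) % n] for k in range(n)]

# A non-linear mechanism: the same input with a square-law term added.
nonlinear = [v + 0.5 * v * v for v in x]

linear_at_difference = amplitude(linear, 1)
linear_at_sum = amplitude(linear, 19)
nonlinear_at_difference = amplitude(nonlinear, 1)
nonlinear_at_sum = amplitude(nonlinear, 19)
```

The linear output contains only the original 9 and 10 cycles, while
the square-law output also contains the difference frequency 1 and the
sum frequency 19.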

Consider the possibility that the brain contains a number of
oscillators of frequencies of nearly 10 per second, and that within
limitations these frequencies can be attracted to one another. Under such
circumstances, the frequencies are likely to be pulled together into
one or more little clumps, at least in certain regions of the spectrum.
The frequencies that are pulled into these clumps will have to be
pulled away from somewhere, thus causing gaps in the spectrum,
where the power is lower than that which we should otherwise expect.
That such a phenomenon may actually take place in the generation
of brain waves for the individual whose autocorrelation is shown in
Fig. 9 is suggested by the sharp drop in the power for frequencies
above 9.0 cycles per second. This could not easily have been
discovered with the low resolving powers of harmonic analysis used by
earlier writers. 1
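
The pulling together of neighboring frequencies can be imitated with a standard toy model. The sketch below uses the Kuramoto model of sinusoidally coupled phase oscillators, which is a swapped-in illustration and not the mechanism proposed in the text; the seven natural frequencies, the coupling strength, the starting phases, and the function name `effective_frequencies` are all assumptions made for the example.

```python
import math

def effective_frequencies(natural_hz, coupling_hz, seconds=20.0, dt=0.001):
    """Euler-integrate Kuramoto oscillators and return the mean frequency
    (in cycles per second) of each one over the second half of the run."""
    n = len(natural_hz)
    omega = [2 * math.pi * f for f in natural_hz]
    k = 2 * math.pi * coupling_hz
    theta = [0.3 * i for i in range(n)]          # arbitrary starting phases
    steps = int(round(seconds / dt))
    half = steps // 2
    theta_half = list(theta)
    for step in range(steps):
        if step == half:
            theta_half = list(theta)             # phases once transients have decayed
        s = sum(math.sin(t) for t in theta)
        c = sum(math.cos(t) for t in theta)
        # (k/n) * sum_j sin(theta_j - theta_i) = (k/n) * (s*cos(theta_i) - c*sin(theta_i))
        theta = [t + dt * (w + (k / n) * (s * math.cos(t) - c * math.sin(t)))
                 for t, w in zip(theta, omega)]
    window = (steps - half) * dt
    return [(b - a) / (2 * math.pi * window) for a, b in zip(theta_half, theta)]

natural = [9.4, 9.7, 9.9, 10.0, 10.1, 10.3, 10.6]   # hypothetical oscillators near 10 per second
free = effective_frequencies(natural, coupling_hz=0.0)
pulled = effective_frequencies(natural, coupling_hz=2.0)
print("spread without coupling:", round(max(free) - min(free), 3))
print("spread with coupling:   ", round(max(pulled) - min(pulled), 3))
```

With this fairly strong coupling the whole group collapses into a single clump near the mean frequency. Weaker coupling would be expected to lock only the central oscillators and leave the outer ones free, which is closer to the picture of clumps and gaps, though the sketch does not attempt to reproduce the measured spectrum.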

1 I must say that some evidence of the existence of narrow central rhythms has
been obtained by Dr. W. Grey Walter of the Burden Neurological Institute in Bristol,
England. I am not acquainted with the full details of his methodology; however, I
understand that the phenomenon to which he refers consists in the fact that in his
toposcopic pictures of brain waves, as one goes out from the center, the rays indicating
the frequency are confined to relatively narrow sectors.

In order that this account of the origin of brain waves should be
tenable, we must examine the brain for the existence and nature of
the oscillators postulated. Professor Rosenblith of M.I.T. has
informed me of the existence of a phenomenon known as the
after-discharge. 1 When a flash of light is delivered to the eyes, the
potentials of the cerebral cortex which can be correlated with the
flash do not return immediately to zero, but go through a sequence of
positive and negative phases before they die out. The pattern of this
potential can be subjected to a harmonic analysis and is found to have
a large amount of power in the neighborhood of 10 cycles. As far as
this goes, it is at least not contradictory to the theory of brain wave
self-organization that we have given here. The pulling together of
these short-time oscillations into a continuing oscillation has been
observed in other bodily rhythms, as for example the approximately
23½-hour diurnal rhythm which is observed in many living beings. 2
This rhythm is capable of being pulled into the 24-hour rhythm of
day and night by the changes in the external environment.
Biologically it is not important whether the natural rhythm of living beings
is precisely a 24-hour rhythm, provided it is capable of being
attracted into the 24-hour rhythm by the external environment.
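
How a rhythm of roughly 23½ hours can be drawn into step with a 24-hour cycle may be sketched with a single phase oscillator driven by a periodic signal. This is a hedged illustration only: the sinusoidal form of the pull, the coupling value of 0.05 radian per hour, and the function name `entrained_period` are assumptions for the example and are not drawn from the biological literature cited.

```python
import math

def entrained_period(natural_hours, drive_hours, coupling,
                     total_hours=2000.0, dt=0.01):
    """Observed period (hours) of a phase oscillator whose phase is pulled
    toward that of a periodic drive; measured over the second half of the run."""
    omega0 = 2 * math.pi / natural_hours     # free-running angular frequency
    omega_d = 2 * math.pi / drive_hours      # angular frequency of day and night
    steps = int(round(total_hours / dt))
    half = steps // 2
    phase = 0.0
    phase_half = 0.0
    for step in range(steps):
        if step == half:
            phase_half = phase
        t = step * dt
        # The drive advances or retards the phase according to the phase difference.
        phase += dt * (omega0 + coupling * math.sin(omega_d * t - phase))
    window = (steps - half) * dt
    return 2 * math.pi * window / (phase - phase_half)

print("free-running period:", round(entrained_period(23.5, 24.0, coupling=0.0), 3))
print("entrained period:   ", round(entrained_period(23.5, 24.0, coupling=0.05), 3))
```

In this model locking occurs whenever the coupling exceeds the difference between the two angular frequencies, which here is about 0.0056 radian per hour, so even a weak environmental pull is enough. That is consistent with the remark that the natural rhythm need not be precisely 24 hours.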

An interesting experiment which may throw light on the validity
of my hypothesis concerning brain waves could quite possibly be
made by the study of fireflies or of other animals such as crickets or
frogs which are capable of emitting detectable visual or auditory
impulses and also capable of receiving these impulses. It has often been
supposed that the fireflies in a tree flash in unison, and this apparent
phenomenon has been put down to a human optical illusion. I have
heard it stated that in the case of some of the fireflies of
Southeastern Asia this phenomenon is so marked that it can scarcely be
put down to illusion. Now the firefly has a double action. On the
one hand it is an emitter of more or less periodical impulses, and on
the other hand it possesses receptors for these impulses. Could not
the same supposed phenomenon of the pulling together of frequencies
take place? For this work, accurate records of the flashings are
necessary which are good enough to subject to an accurate harmonic
analysis. Moreover, the fireflies should be subjected to periodic
light, as for example from a flashing neon tube, and we should

1 Barlow, J. S., "Rhythmic Activity Induced by Photic Stimulation in Relation to
Intrinsic Alpha Activity of the Brain in Man," EEG Clin. Neurophysiol., 12, 317-326
(1960).

2 Cold Spring Harbor Symposium on Quantitative Biology, Volume XXV (Biological
Clocks), The Biological Laboratory, Cold Spring Harbor, L.I., N.Y., 1960.
determine whether this has a tendency to pull them into frequency
with itself. If this should be the case, we should try to obtain an
accurate record of these spontaneous flashes to subject to an
autocorrelation analysis similar to that which we have made in the case of
the brain waves. Without daring to pronounce on the outcome of
experiments which have not been made, this line of research strikes
me as promising and not too difficult.
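
The kind of autocorrelation analysis proposed for a record of flashes can be sketched in a few lines. The record below is synthetic: the sampling rate of 20 per second, the one-second flash interval, the three-sample flash shape, and the function names are all assumptions made for illustration, and no real firefly data are implied.

```python
def autocorrelation(samples, max_lag):
    """Biased autocorrelation of the mean-removed record for lags 0..max_lag."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [s - mean for s in samples]
    return [sum(centered[i] * centered[i + lag] for i in range(n - lag)) / n
            for lag in range(max_lag + 1)]

def dominant_period(samples, sample_rate_hz, min_lag, max_lag):
    """Period (seconds) at which the autocorrelation is largest within the lag range."""
    r = autocorrelation(samples, max_lag)
    best = max(range(min_lag, max_lag + 1), key=lambda lag: r[lag])
    return best / sample_rate_hz

RATE = 20                      # samples per second (assumed)
record = [0.0] * 400           # twenty seconds of synthetic light intensity
for centre in range(10, 400, 20):
    # One short flash per second, spread over three samples.
    record[centre - 1] += 0.5
    record[centre] += 1.0
    record[centre + 1] += 0.5

print("dominant period in seconds:",
      dominant_period(record, RATE, min_lag=5, max_lag=60))
```

The lower lag limit excludes the trivial peak near zero lag that comes from the width of each flash. A harmonic analysis of the autocorrelation, as was done for the brain waves, would then give the spectrum of the flashing.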

The phenomenon of the attraction of frequencies also occurs in
certain non-living situations. Consider a number of electrical
alternators with their frequencies controlled by governors attached
to the prime movers. These governors hold the frequencies in
comparatively narrow regions. Suppose the outputs of the generators to
be combined in parallel on busbars from which the current goes out
to the external load, which will in general be subject to more or less
random fluctuations due to the turning on and off of lights and the like.
In order to avoid the human problems of switching which occur in
the old-fashioned sort of central station, we shall suppose the
switching on and off of the generators to be automatic. When the
generator is brought to a speed and phase near enough to that of the other
generators of the system, an automatic device will connect it to the
busbars, and if by some chance it should depart too far from the
proper frequency and phase, a similar device will automatically
switch it off. In such a system, a generator which is tending to run
too fast and thus to have too high a frequency takes a part of the
load which is greater than its normal share, whereas a generator
which is running too slow takes a less than normal part of the load.
The result is that there is an attraction between the frequencies of the
generators. The total generating system acts as if it possessed a
virtual governor, more accurate than the governors of the individual
generators and constituted by the set of these governors with the
mutual electrical interaction of the generators. To this the accurate
frequency regulation of electrical generating systems is at least in
part due. It is this which makes possible the use of electrical clocks
of high accuracy.
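
The load-sharing argument can be put into a small static calculation. The sketch assumes a linear "droop" relation between frequency and load for each governor, which is a common engineering idealization but is not stated in the text; the setpoints, the droop constant, the megawatt figures, and the function name `share_load` are likewise hypothetical.

```python
def share_load(setpoints_hz, nominal_mw, total_load_mw, droop_hz_per_mw):
    """Common busbar frequency and the load taken by each generator.

    Assumed model: generator i carries nominal_mw[i] when the busbar runs at
    its own governor setpoint, and picks up 1/droop extra megawatts for every
    cycle per second by which its setpoint exceeds the common frequency.
    """
    n = len(setpoints_hz)
    surplus = sum(nominal_mw) - total_load_mw
    # Setting the sum of the individual loads equal to the total load gives:
    common_hz = sum(setpoints_hz) / n + droop_hz_per_mw * surplus / n
    loads = [p + (f - common_hz) / droop_hz_per_mw
             for f, p in zip(setpoints_hz, nominal_mw)]
    return common_hz, loads

setpoints = [59.7, 59.9, 60.1, 60.3]      # each governor is slightly off
common, loads = share_load(setpoints, [10.0] * 4, total_load_mw=40.0,
                           droop_hz_per_mw=0.1)
print("common frequency:", round(common, 3))
print("loads:", [round(p, 2) for p in loads])
```

Under these assumptions the generators set to run fast take more than their nominal share and those set to run slow take less, while the common frequency is the average of the individual setpoints. Since the errors of the separate governors partly cancel in that average, the combined system behaves like a governor more accurate than any one of them.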

I therefore suggest that the output of such systems be studied both 
experimentally and theoretically in a manner parallel to that in 
which we have studied the brain waves. 

Historically it is interesting that in the early days of
alternating-current engineering, attempts were made to connect generators of
the same constant-voltage type used in modern generating systems
in series rather than in parallel. It was found that the interaction
of the individual generators in frequency was a repulsion rather than
an attraction. The result was that such systems were impossibly
unstable unless the rotating parts of the individual generators were
connected rigidly by a common shaft or by gearing. On the other
hand the parallel busbar connection of generators proved to have an
intrinsic stability which made it possible to unite generators at
different stations into a single self-containing system. To use a
biological analogy, the parallel system had a better homeostasis than
the series system and therefore survived, while the series system
eliminated itself by natural selection.

We thus see that a non-linear interaction causing the attraction of
frequency can generate a self-organizing system, as it does, for
example, in the case of the brain waves we have discussed and in the
case of the a-c network. This possibility of self-organization is by
no means limited to the very low frequency of these two phenomena.
Consider self-organizing systems at the frequency level, say, of
infra-red light or radar spectra.

As we have stated before, one of the prime problems of biology
is the way in which the capital substances constituting genes or
viruses, or possibly specific substances producing cancer, reproduce
themselves out of materials devoid of this specificity, such as a
mixture of amino and nucleic acids. The usual explanation given is
that one molecule of these substances acts as a template according to
which the constituent's smaller molecules lay themselves down and
unite into a similar macromolecule. This is largely a figure of speech
and is merely another way of describing the fundamental phenomenon
of life, which is that other macromolecules are formed in the image of
the existing macromolecules. However this process occurs, it is a
dynamic process and involves forces or their equivalent. An entirely
possible way of describing such forces is that the active bearer of
the specificity of a molecule may lie in the frequency pattern of its
molecular radiation, an important part of which may lie in infra-red
electromagnetic frequency or even lower. It may be that specific
virus substances under some circumstances emit infra-red
oscillations which have the power of favoring the formation of other
molecules of the virus from an indifferent magma of amino acids and
nucleic acids. It is quite possible that this phenomenon may be
regarded as a sort of attractive interaction of frequency. As this
whole matter is still sub judice, with the details not even formulated,
I forbear to be more specific. The obvious way of investigating this
is to study the absorption and emission spectra of a massive quantity
of virus material, such as the crystal of the tobacco mosaic virus, and
then to observe the effects of light of these frequencies on the
production of more virus from existing virus in the proper nutrient material.
When I speak of absorption spectra, I am talking of a phenomenon
which is almost certain to exist; and as to emission spectra, we have
something of the sort in the phenomenon of fluorescence.

Any such research will involve a highly accurate method for the
detailed examination of spectra in the presence of what would
ordinarily be considered excessive amounts of light of a continuous
spectrum. We have already seen that we are faced with a similar
problem in the microanalysis of brain waves, and that the
mathematics of interferometer spectrography is essentially the same as that
which we have undertaken here. I then make the definite
suggestion that the full power of this method be explored in the study of
molecular spectra, and in particular in the study of such spectra of
viruses, genes, and cancer. It is premature to predict the entire
value of these methods both in pure biological research and in
medicine, but I have great hopes that they may be proved to be of the
utmost value in both fields.